Facebook Messenger
This notebook shows how to load data from Facebook in a format you can fine-tune on. The overall steps are:
- Download your messenger data to disk.
- Create the Chat Loader and call
loader.load()
(orloader.lazy_load()
) to perform the conversion. - Optionally use
merge_chat_runs
to combine message from the same sender in sequence, and/ormap_ai_messages
to convert messages from the specified sender to the "AIMessage" class. Once you've done this, callconvert_messages_for_finetuning
to prepare your data for fine-tuning.
Once this has been done, you can fine-tune your model. To do so you would complete the following steps:
- Upload your messages to OpenAI and run a fine-tuning job.
- Use the resulting model in your LangChain app!
Let's begin.
1. Download Data
To download your own messenger data, following instructions here. IMPORTANT - make sure to download them in JSON format (not HTML).
We are hosting an example dump at this google drive link that we will use in this walkthrough.
# This uses some example data
import zipfile
import requests
def download_and_unzip(url: str, output_path: str = "file.zip") -> None:
file_id = url.split("/")[-2]
download_url = f"https://drive.google.com/uc?export=download&id={file_id}"
response = requests.get(download_url)
if response.status_code != 200:
print("Failed to download the file.")
return
with open(output_path, "wb") as file:
file.write(response.content)
print(f"File {output_path} downloaded.")
with zipfile.ZipFile(output_path, "r") as zip_ref:
zip_ref.extractall()
print(f"File {output_path} has been unzipped.")
# URL of the file to download
url = (
"https://drive.google.com/file/d/1rh1s1o2i7B-Sk1v9o8KNgivLVGwJ-osV/view?usp=sharing"
)
# Download and unzip
download_and_unzip(url)
File file.zip downloaded.
File file.zip has been unzipped.