What’s actually in your Discord data package? I hope you're sitting down

Right, did this last weekend on a whim and reckon it’s worth sharing because most people I’ve spoken to assume Discord doesn’t keep much. WRONG. They keep heaps.

How to request: Settings > Privacy & Safety > Request all of my data. Tick “include all messages from servers I’m in” if you want the full picture (it’s off by default, which already tells you all you need to know lol). Took about three days for the email with the download link. The file came as a zip; mine was 1.2 GB, which I wasn’t expecting ’cause I’m not a heavy user at all.

So, what’s inside?

top level:

  • account/ every email and password change, every login from every IP and device since I made the account in 2018
  • activity/ every voice channel I’ve ever joined, with timestamps. Every game launch Discord noticed (the rich presence stuff). Every link I’ve ever clicked from inside a Discord message
  • messages/ JSON files for every DM and every server I’ve been in. Includes edited message history, so the ‘before’ version of any message you’ve edited is still in there
  • servers/ every server I’ve joined or left, with timestamps
  • programs/ purchase history (nitro etc)
  • README.txt comically short given the volume of data
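
If you want to see which of those folders is eating the space, here's a rough Python sketch that totals the bytes under each top-level entry of the unzipped package. It only assumes you've extracted the zip somewhere; `folder_sizes` is just a name I made up, and the folder names it reports are whatever is actually in your package.

```python
from pathlib import Path


def folder_sizes(package_root):
    """Return {top-level entry name: total bytes} for an extracted data package."""
    root = Path(package_root)
    sizes = {}
    for entry in root.iterdir():
        if entry.is_dir():
            # Sum every file under this top-level folder, however deeply nested.
            sizes[entry.name] = sum(
                f.stat().st_size for f in entry.rglob("*") if f.is_file()
            )
        else:
            # Loose files at the top level (README.txt and friends).
            sizes[entry.name] = entry.stat().st_size
    return sizes
```

Call it as `folder_sizes("path/to/unzipped/package")` and sort the result by value to see the biggest folder first.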

The bit that genuinely spooked me is that the messages folder has my message history from servers I left years ago. Not just the messages themselves, but timestamps, channel names, and even the replies to me from other users. I’d assumed leaving a server cleaned that up…newp.

Worth requesting yours just to know what’s there. All you need programme-wise is a JSON viewer of your choice for opening the message files (they’re not human-readable in a text editor, well, technically they are but it’s grim).

DO IT https://support.discord.com/hc/en-us/articles/360004027692-Requesting-a-Copy-of-your-Data

Glad someone finally posted this. The JSON structure for messages is actually really interesting once you start querying it.

Each message JSON has fields for: ID, timestamp, content, attachments (with original CDN URLs that often still resolve even after the server is gone), and edit history as an array of full prior versions. If you want to actually do something with the data instead of just stare at it, throw it in jq.

Counting your total messages in a channel: jq '. | length' messages.json

Finding the timestamp of your earliest message: jq '[.[].Timestamp] | sort | first' messages.json

Pulling messages from a specific year: jq '.[] | select(.Timestamp | startswith("2019"))' messages.json
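
If you'd rather not install jq, the same kind of query is a few lines of Python. This is only a sketch and it leans on the same assumptions as the jq commands above: that messages.json is a JSON array of objects and that each one has a `Timestamp` string starting with the year. `messages_per_year` is a made-up helper name.

```python
import json
from collections import Counter


def messages_per_year(path):
    """Count messages per year in one messages.json file.

    Assumes the file is a JSON array of objects whose "Timestamp" value
    is a string beginning with a four-digit year.
    """
    with open(path, encoding="utf-8") as fh:
        messages = json.load(fh)
    # Take the first four characters of each timestamp as the year.
    return Counter(m["Timestamp"][:4] for m in messages if m.get("Timestamp"))
```

`messages_per_year("messages.json").most_common()` then gives you your busiest years first.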

I found DMs from people I haven’t spoken to in years and years that I had completely forgotten about.

the attachment URLs in the export are signed. Some still work, some have expired. Discord rolled out attachment URL expiration in 2024 so if your archive is older than that the URLs are baked in but the assets behind them might be gone.
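
You can often tell whether a signed URL has expired without fetching it. A hedged Python sketch: it ASSUMES the expiry is carried in an `ex` query parameter as a hexadecimal Unix timestamp, which is what the signed URLs I've looked at do, but check that against your own export before trusting it. `url_expiry` and `is_expired` are hypothetical helper names.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def url_expiry(url):
    """Return the expiry as an aware UTC datetime, or None if it can't be read.

    Assumption: the URL carries an "ex" query parameter holding a
    hexadecimal Unix timestamp.
    """
    values = parse_qs(urlparse(url).query).get("ex")
    if not values:
        return None
    try:
        return datetime.fromtimestamp(int(values[0], 16), tz=timezone.utc)
    except (ValueError, OverflowError, OSError):
        # Not hex, or not a sensible timestamp.
        return None


def is_expired(url, now=None):
    """True/False if the expiry is readable, None if the URL has no usable "ex"."""
    expiry = url_expiry(url)
    if expiry is None:
        return None
    if now is None:
        now = datetime.now(timezone.utc)
    return expiry <= now
```

Even if the signature has expired the file itself may or may not still exist, so treat this as a quick first filter and nothing more.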

OK I am doing this now because of this thread. Just got my zip file open and I’m in a weird headspace. There’s a folder of DMs with someone I haven’t talked to since 2019 because of a fight, and reading back through it I realized neither of us was actually wrong about what we were fighting over, we were both kind of right. Ten thousand messages in that DM too! I forgot we used to talk that much. Actually kind of makes me sad.

The data package isn’t just data, it’s like finding your long-lost online diary from your younger years.

thank you for posting this. I think a lot of people are going to have feelings they didn’t expect when going through their discord history files.

it is worth adding the regulatory context for WHY the data package looks the way it does. Discord’s data retention is governed by their privacy policy + GDPR (for the EU users) and the various US state laws (CCPA, CPRA, and whatever). Their published policy says they retain message data “as long as your account is active” with no upper bound. Account deletion triggers a 14 to 30 day grace period, then a hard delete that supposedly purges the database.

The data package is essentially Discord complying with right of access requirements. What you can request is more or less what they’re legally required to give you, which is interesting because it means the existence of the package is itself a regulatory artifact. Before GDPR you couldn’t get this at all.

The things they collect that aren’t in the package are also worth knowing. Inferred attributes (estimated age, language, location at the city level), AI training derivatives if your messages were used in any modeling work, and aggregated analytics data are all generally excluded.

did this last month and man, my zip was 4.2 GB. I have been on discord since 2017 and I have apparently sent over 600,000 messages haha

the part that got me was the voice activity log shows every voice channel I joined ever

this is wild my zip is 800 MB and I’m not even what I’d call a power user.

Back when I was on AOL I had no concept that any of this stuff was being kept. You just typed and sent and that was that. Now I open this zip and there’s literally a folder telling me which links I clicked in 2020. Some of those links I’m sure I don’t even remember clicking, the data knows me better than I know myself.

The wife thinks I’m being paranoid for caring about it. Showed her the activity folder and she stopped saying that haha

oh nvm, apparently 800MB is nothing lol

if you just dl your data I would do the following:

audit what you’d want gone. The messages folder is your map for a Redact cleanup. You can see exactly which servers and DMs have your most sensitive history. Run cleanup on the heaviest ones first before the others.

save anything you actually want to keep. Specifically images and files attached to DMs from people you care about. Those CDN URLs do expire, so if there’s something sentimental in there, archive it locally before you do anything else.

Don’t doomscroll your own past. Easy to get lost in here for hours. Set a time limit on the first read-through, then come back to it deliberately when you have a specific thing you want to look up. reading every old conversation in one sitting is the privacy equivalent of looking at every photo album you own back to back, it’s a lot.

The package is a tool, not a trip. Use it for cleanup, then archive it somewhere safe.
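
For the "heaviest ones first" part, here's a rough Python sketch that ranks conversations by message count. It ASSUMES each conversation lives in its own folder with a messages.json inside that holds a JSON array; that's my guess at the layout, so adjust if yours differs. `rank_channels` is a made-up name.

```python
import json
from pathlib import Path


def rank_channels(messages_dir):
    """Return [(folder name, message count), ...], biggest first.

    Assumption: every conversation is a folder containing a messages.json
    file whose top level is a JSON array of messages.
    """
    counts = []
    for path in Path(messages_dir).rglob("messages.json"):
        try:
            with open(path, encoding="utf-8") as fh:
                data = json.load(fh)
        except (OSError, json.JSONDecodeError):
            # Skip unreadable or malformed files rather than abort the audit.
            continue
        if isinstance(data, list):
            counts.append((path.parent.name, len(data)))
    return sorted(counts, key=lambda item: item[1], reverse=True)
```

`rank_channels("package/messages")[:10]` gives you a top ten to start the cleanup from.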

I’m scared to even know what’s in mine

just downloaded mine on a Saturday afternoon and I have lost the entire afternoon. 9 GB. nine. gigabytes. of mostly gif reactions and one liners I sent in a 2019 server that no longer exists. if there’s a more humbling way to spend a Saturday I haven’t found it.

The IP address log is not just for security. Each IP can be linked to a physical address through ISP records if law enforcement asks. Years of this in your data. Not theoretical.

In my country we learn early to assume everything is logged forever. Surprised when Americans treat this as new information.

I downloaded mine specifically to see who I was DMing during a six month period in 2020 because I forgot something important. found it in twenty seconds. weirdest superpower that this archive exists.