A Reddit Archiver with Firebase Functions and Firestore
Part of the Encounters London Project is to archive the LondonR4R feed because community members need confidence in the feed, i.e. that the people posting are who they say they are. We can achieve this by looking at a particular poster's history, checking how often they post, and looking for inconsistencies.
As Encounters London looks to be both a part of LondonR4R and its own thing, we can build tools for one that, as a result, become useful for the other, since they share a purpose.
The MVP
Purpose
To create a tool that will build confidence in the LondonR4R community.
Vision
A way to collect and display the previous posts of LondonR4R.
Problem
As a community member in LondonR4R, I want to meet people for mutual and consensual experiences, but I don't know if the people behind the posts are who they say they are because it is a place where people can keep their anonymity. Therefore, I feel engaging with people is risky and has a low chance of success.
As a moderator of LondonR4R, I need to keep posts as genuine as possible, but in many instances I cannot see a member's previous posts because once a post is removed from Reddit, it’s like it never existed. Therefore, I feel there is never a clear picture of every member and how they post.
Validation
An archive has already been thrown together at https://r4r-companion.herokuapp.com/archive/ and has proven to be extremely useful in the moderation tasks of LondonR4R. But Heroku is removing their free tier, and scalability is tough.
Goals and Success
I want every post archived as it is posted, stored as a post URL and author, so that everyone can review what came before from that person.
I want to deploy on Firebase so that the archive becomes a feature of Encounters London and part of its go-to-market strategy.
Work to be Done
Stage 1: Familiarisation with Firebase Functions
As I have not used Google’s Firebase Functions before, I will have to dip my toe in and see how to create a basic function.
First, I went to the Firebase console and created a new project to test in. I created it under my organisation, gave it a name, and turned off Google Analytics.
From the Project view in the navigation menu, I selected Functions under Build. The next window told me I must upgrade my plan, so I did that.
After clicking Continue, I was taken to the Google Cloud Platform to add payment info, where I added my business payment details and confirmed. I was then taken back to Firebase to set a payment limit; I put in £100 for now and clicked Continue.
Now back in Firebase Functions, I clicked Get started and was prompted to install firebase-tools via npm. From previous work I already had it installed, so I just checked from the terminal that the install was working.
Clicking Continue led me to the instructions on how to deploy, which start by going into the directory to work in, so I created that and opened it in VSCode.
I started by logging in with firebase login and initialising the project with firebase init.
There are some prompts in the flow of the command line interface.
Chose an existing project and selected the one I had just created
What language would you like to use to write Cloud Functions? JavaScript
Do you want to use ESLint to catch probable bugs and enforce style? Yes
Do you want to install dependencies with npm now? Yes
Everything will be downloaded into my project folder, so I have some files to start with.
The main source file for my Cloud Functions code is in the index.js file. If we take a peek, we can see the main import for cloud functions and some commented-out ‘hello, world’ sample code. So I will uncomment it and deploy it; thanks for thinking about the development experience, Google.
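For reference, the uncommented sample looks roughly like the sketch below. The handler is written as a named function so it can be run without the SDK; the wiring lines are left as comments and assume the firebase-functions package that the init step installs.

```javascript
// Sketch of the 'hello, world' HTTPS function from the generated index.js.
// A Firebase HTTPS handler receives Express-style request and response objects.
function helloWorld(request, response) {
  response.send("Hello from Firebase!");
}

// In index.js the handler would be registered like this (assumes the
// firebase-functions package is installed in the functions folder):
// const functions = require("firebase-functions");
// exports.helloWorld = functions.https.onRequest(helloWorld);
```

Once deployed, requesting the function URL should return the text passed to response.send.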
Then I ran firebase deploy in the terminal.
Now back in the Console for Firebase > Function, I see a new function from my deployment.
I grabbed the URL from the terminal output for Firebase deploy and popped that in my browser. (https://us-central1-reddit-archiver-dev.cloudfunctions.net/helloWorld)
Stage 2: Passing data through a POST request
Data from the subreddit will arrive via IFTTT, so I can send all the metadata, e.g. author, date, etc., via a POST request. The first experiment is just logging the request body to see what the function receives, so I updated the logging line to output the body.
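A minimal sketch of that experiment is below. It is not the exact line from my index.js: the helper name makeLoggingHandler is hypothetical, and the logger is passed in so the sketch can run locally with console instead of functions.logger.

```javascript
// Builds a handler that logs the incoming POST body before replying.
// `logger` is injected so the sketch runs locally; in a deployed function
// it would be functions.logger from the firebase-functions package.
function makeLoggingHandler(logger) {
  return (request, response) => {
    logger.info("Request body", request.body);
    response.send("Hello from Firebase!");
  };
}

// Possible wiring in index.js (an assumption, not the original code):
// exports.helloWorld =
//     functions.https.onRequest(makeLoggingHandler(functions.logger));
```

Injecting the logger keeps the handler easy to try out before deploying, for example by calling it with console and a fake response object.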
In Postman, I requested the URL with a body containing a single key:value pair.
The logs then show the data output by the changed line.
From here, I will remove the second parameter and parse the JSON so I can pull specific data out.
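Pulling a value out of the body could look something like the sketch below. Cloud Functions parses application/json bodies automatically, so the body is usually already an object; the string fallback, the helper name readBody, and the key name author are assumptions for illustration.

```javascript
// Returns the request body as an object.
// A JSON body normally arrives already parsed; a raw string is parsed
// as a fallback, and a missing body becomes an empty object.
function readBody(body) {
  if (typeof body === "string") {
    return JSON.parse(body);
  }
  return body || {};
}

// Example: pull one value out of a Postman-style key:value body.
// The key name "author" is hypothetical.
const {author} = readBody("{\"author\": \"example_user\"}");
console.log(author); // prints example_user
```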
Stage 3: Getting some data from Reddit
As mentioned, I will be utilising IFTTT. I am currently using it successfully to ingest data into the current archiver.
Firstly I create an IFTTT applet for Reddit, with “test” as the subreddit. (https://www.reddit.com/r/test/)
I then set up the “that” action so it sends a request to the function.
I also reverted the code to its previous version and deployed it to check what output I get.
I then created a post in the subreddit.
Going back to Firebase Functions, the Reddit post shows up!
As things are working as expected, I’ll update the IFTTT applet with additional metadata.
I updated the code to log everything being sent and deployed it.
Now I create another post in r/test.
And in Firebase > Function > Logs …
Hoorah! But a couple of things to note. Firstly, the URL has ?utm_source=ifttt appended, and secondly, the date is a string, which could get awkward to parse.
Stage 4: Pushing the data to Firestore
Now I have an API and automation that will get data from every new post in a subreddit, but that data needs to be stored, so I will use Firestore for that.
I started by creating a Firestore Database in Firebase with “test” options for rules. I then created a collection called “test-archive” and added a document with one field.
In my code, I’ve added the Firebase Admin SDK to access Firestore.
I updated my function by first changing it to an async function. I then added a write of a document containing test data, and finally returned the ID of the document just created.
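A sketch of that change is below. The Firestore instance is passed in so the handler can be exercised without credentials; the helper name makeArchiveHandler and the test field are hypothetical, and the commented setup lines assume the firebase-admin package.

```javascript
// Builds an async handler that writes a test document and returns its ID.
// `db` is expected to be a Firestore instance; in index.js it would come from:
//   const admin = require("firebase-admin");
//   admin.initializeApp();
//   const db = admin.firestore();
function makeArchiveHandler(db) {
  return async (request, response) => {
    // add() creates a document with an auto-generated ID and resolves
    // to a reference to the new document.
    const docRef = await db.collection("test-archive").add({
      title: "test title", // hypothetical test field
    });
    response.send(docRef.id);
  };
}

// Possible wiring in index.js (an assumption, not the original code):
// exports.helloWorld = functions.https.onRequest(makeArchiveHandler(db));
```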
However, I kept getting a linting error when deploying.
To fix this, I added parserOptions with ecmaVersion: 8 to .eslintrc.js, which lets ESLint parse async functions.
From here, deployment works fine. Next, I set up Postman to match up to the API with some test data.
And after clicking send …
Now I create a JSON object with the field names and the values from the keys provided in the call. I also move the logging further down to capture the document ID. Lastly, I commented out the date, as I don’t think it would be useful in its current form.
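Building that object might look like the sketch below. The key names (title, author, url, posted_at) and the helper name buildPost are hypothetical; use whatever keys the IFTTT applet actually sends.

```javascript
// Maps the incoming request body onto the fields stored in Firestore.
// All key names here are hypothetical placeholders.
function buildPost(body) {
  return {
    title: body.title,
    author: body.author,
    url: body.url,
    // postedAt: body.posted_at, // left out: arrives as an awkward string
  };
}

// Example with made-up values.
console.log(buildPost({
  title: "Example title",
  author: "example_user",
  url: "https://www.reddit.com/r/test/",
}));
```

The resulting object can then be passed to the collection's add() call in place of the test data.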
Now I create another post in r/test and update the IFTTT to remove the posted_at ingredient. And after all of that:
Stage 5: Little improvements
Things are working, but there are just a couple of changes to make. I removed the IFTTT source parameter from the URL and added a timestamp, set to the current time, to record when the post is archived.
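One way to make both changes is sketched below, under a few assumptions: the helper names and the archivedAt field are hypothetical, the whole query string is dropped rather than only the IFTTT parameter, and a plain Date is used for the timestamp (the Admin SDK's FieldValue.serverTimestamp() would be an alternative).

```javascript
// Removes the query string that IFTTT appends to a post URL.
function stripTracking(postUrl) {
  const url = new URL(postUrl);
  url.search = ""; // drops everything after the "?"
  return url.toString();
}

// Returns a copy of the post with a cleaned URL and an archive timestamp.
// `now` defaults to the current time and can be overridden when testing.
function withArchivedAt(post, now = new Date()) {
  return Object.assign({}, post, {
    url: stripTracking(post.url),
    archivedAt: now,
  });
}
```

Object.assign is used instead of object spread so the code still parses with the ecmaVersion: 8 lint setting.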
Moving Forward
This experiment has worked out nicely and has given me the confidence to deploy it into production. The next step is to deploy for r/LondonR4R and get the feed automatically archived. This will lead to Encounters London being the place to go to view the previous posts in the feed, as part of the go-to-market strategy for this product; I will therefore build the interface in Flutter.