Dom Maurice


A Reddit Archiver with Firebase Functions and Firestore

Part of the Encounters London Project is to archive the LondonR4R feed because community members need confidence in the feed, i.e. that the people posting are who they say they are. We can achieve this by looking at a particular poster's history, checking how often they post, and looking for inconsistencies.

As Encounters London looks to be both a part of LondonR4R and its own thing, we can build tools for one that, as a result, become useful for the other, since they share a purpose.

The MVP

Purpose

To create a tool that will build confidence in the LondonR4R community.

Vision

A way to collect and display the previous posts of LondonR4R.

Problem

As a community member in LondonR4R, I want to meet people for mutual and consensual experiences, but I don't know if the people behind the posts are who they say they are because it is a place where people can keep their anonymity. Therefore, I feel engaging with people is risky and has a low chance of success.

As a moderator of LondonR4R, I need to keep posts as genuine as possible, but in many instances I cannot see a member’s previous posts because once a post is removed from Reddit, it’s as if it never existed. Therefore, I feel there is never a clear picture of every member and how they post.

Validation

An archive has already been thrown together at https://r4r-companion.herokuapp.com/archive/ and has proven to be extremely useful in the moderation tasks of LondonR4R. But Heroku is removing their free tier, and scalability is tough.

Goals and Success

  • I want every post archived as it is posted, stored as a post URL and author, so that everyone can review what came before from that person.

  • I want to deploy in a way that I can utilise Firebase to create it as a feature of Encounters London and use it as a part of the go-to-market strategy.

Work to be Done

Stage 1: Familiarisation with Firebase Functions

As I have not used Google’s Firebase Functions before, I will have to dip my toe in and see how to create a basic function.

First, I went to the Firebase console and created a new Project to test in. I created it under my organisation, gave it a name, and turned off Google Analytics.

From the Project view in the navigation menu, I selected Functions under Build. The next window told me I must upgrade my plan, so I did that.

After clicking Continue, I was taken to add payment info in the Google Cloud Platform, where I added my business payment details and confirmed. I was then taken back to Firebase to set a payment limit; I put in £100 for now and clicked Continue.

Back in Firebase Functions, I clicked Get started and was prompted to install firebase-tools via npm. From previous work, I already had it installed, so I checked my install by running:

See this content in the original post

Clicking Continue led me to the deployment instructions, which prompted me to go into the directory to work in, so I created one and opened it in VSCode.

I started by logging in and initialising the project.

See this content in the original post

There are some prompts in the flow of the command line interface.

  • Chose an existing project and selected the one I had just created

  • What language would you like to use to write Cloud Functions? JavaScript

  • Do you want to use ESLint to catch probable bugs and enforce style? Yes

  • Do you want to install dependencies with npm now? Yes

Everything will be downloaded into my project folder, so I have some files to start with.

The main source file for my Cloud Functions code is in the index.js file. If we take a peek, we can see the main import for cloud functions and some commented-out ‘hello, world’ sample code. So I will uncomment it and deploy it; thanks for thinking about the development experience, Google.

See this content in the original post
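The sample itself is not reproduced here. As a rough sketch of the shape such a function takes, assuming the firebase-functions v1 API (functions.https.onRequest), the handler below is written as a plain function so it can be tried locally with a stand-in response object. The wiring lines are left as comments because they need the firebase-functions package installed, and the greeting text is an assumption.

```javascript
// Plain HTTP handler: logs a line and replies with a greeting.
function helloWorld(request, response) {
  console.log("Hello logs!");
  response.send("Hello from Firebase!");
}

// In functions/index.js this would be exported roughly as follows
// (assumes the firebase-functions package is installed):
//   const functions = require("firebase-functions");
//   exports.helloWorld = functions.https.onRequest(helloWorld);

// Local check with a minimal stand-in for the response object.
const sent = [];
helloWorld({}, {send: (text) => sent.push(text)});
console.log(sent[0]);
```

Keeping the handler separate from the export makes it easy to poke at before each deploy.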

And run in the terminal:

See this content in the original post

Now back in the Firebase Console under Functions, I see a new function from my deployment.

I grabbed the URL from the terminal output for Firebase deploy and popped that in my browser. (https://us-central1-reddit-archiver-dev.cloudfunctions.net/helloWorld)

And from the logs.

Stage 2: Passing data through a POST request

As data from the subreddit will come in via IFTTT, I can send all the metadata (e.g. author, date) via a POST request. The first experiment is just adding the body to the logs to see what is parsed in the function. The line to log is updated to:

See this content in the original post

In Postman, the URL is requested with a body containing a key:value pair.

This leads to a result in the logs that displays the data from the changed line.

From here, I will remove the second parameter and parse the JSON so I can pull specific data out.

See this content in the original post
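A minimal sketch of pulling specific data out of the body. It assumes the body arrives either already parsed (Cloud Functions parses JSON automatically when the Content-Type is application/json) or as a raw JSON string. The helper names readBody and logAuthor, and the author key, are hypothetical.

```javascript
// Return the request body as an object, parsing it first if it is a string.
function readBody(request) {
  const body = request.body;
  if (typeof body === "string") {
    return JSON.parse(body);
  }
  return body || {};
}

// Handler that pulls one field (assumed to be called "author") out of the body.
function logAuthor(request, response) {
  const data = readBody(request);
  console.log("author:", data.author);
  response.send(`Received post by ${data.author}`);
}
```

In Postman, sending {"author": "u/example"} with a JSON content type would exercise the already-parsed path.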

Stage 3: Getting some data from Reddit

As mentioned, I will be utilising IFTTT. I am currently using it and having success with ingesting data into the current archiver.

Firstly I create an IFTTT applet for Reddit, with “test” as the subreddit. (https://www.reddit.com/r/test/)

I then configure the “that” action with the following settings:

I’ve also changed the code back to what it was previously and deployed it to check what output I get.

See this content in the original post

I then created a post in the subreddit.

Going back to Firebase Functions, the Reddit post shows up!

As things are working as expected, I’ll update the IFTTT applet with additional metadata.

See this content in the original post

I updated the code to log everything being sent and deployed it.

See this content in the original post

Now I create another post in r/test.

And in Firebase > Function > Logs …

Hoorah! But a couple of things to note. Firstly, the URL has ?utm_source=ifttt appended, and secondly, the date is a string, which could get awkward to parse.
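To illustrate why the date string is awkward, here is a rough sketch of parsing it by hand. It assumes the string looks like "September 30, 2022 at 10:15AM" and treats the time as UTC; both are assumptions, and parsePostedAt is a hypothetical helper name.

```javascript
const MONTHS = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];

// Parse a string such as "September 30, 2022 at 10:15AM" into a Date.
// Assumption: this exact format, with the time interpreted as UTC.
// Returns null when the string does not match.
function parsePostedAt(text) {
  const pattern = /^(\w+) (\d{1,2}), (\d{4}) at (\d{1,2}):(\d{2})(AM|PM)$/;
  const match = pattern.exec(text.trim());
  if (!match) return null;
  const [, monthName, day, year, hour, minute, meridiem] = match;
  const month = MONTHS.indexOf(monthName);
  if (month === -1) return null;
  // Convert the 12-hour clock to 24-hour: 12AM becomes 0, 12PM stays 12.
  let hour24 = Number(hour) % 12;
  if (meridiem === "PM") hour24 += 12;
  return new Date(
      Date.UTC(Number(year), month, Number(day), hour24, Number(minute)));
}
```

The string carries no time zone, so any parsed value is only as good as the zone assumption behind it.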

Stage 4: Pushing the data to Firestore

Now I have an API and automation that will get data from every new post in a subreddit, but that data needs to be stored, so I will use Firestore.

I started by creating a Firestore Database in Firebase with “test” options for rules. I then created a collection called “test-archive” and added a document with one field.

In my code, I’ve added the Firebase Admin SDK to access Firestore.

See this content in the original post

My function was updated by first changing it to an async function. I then added a document write with test data, and finally returned the ID of the document just created.

See this content in the original post
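A sketch of what an async handler along these lines might look like. The Firestore handle is passed in so the handler can be tried against an in-memory stand-in. The names makeArchiveHandler, fakeDb, and the exported name archive are hypothetical, and the commented wiring assumes firebase-admin and firebase-functions are installed.

```javascript
// Build the handler around a Firestore-like `db` so it can be tried locally.
function makeArchiveHandler(db) {
  return async (request, response) => {
    // add() creates a document with an auto-generated ID.
    const docRef = await db.collection("test-archive").add({test: "data"});
    response.send(`Document written with ID: ${docRef.id}`);
  };
}

// Wiring in functions/index.js (not run here):
//   const functions = require("firebase-functions");
//   const admin = require("firebase-admin");
//   admin.initializeApp();
//   exports.archive = functions.https.onRequest(
//       makeArchiveHandler(admin.firestore()));

// In-memory stand-in for Firestore, for local experimentation only.
function fakeDb() {
  const docs = [];
  return {
    docs,
    collection: (name) => ({
      add: async (data) => {
        docs.push({collection: name, data});
        return {id: `doc-${docs.length}`};
      },
    }),
  };
}
```

Calling makeArchiveHandler(fakeDb()) gives a handler that behaves like the deployed one without touching a real database.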

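However, on deploying I kept getting the following error:

See this content in the original post

To fix this, I added parserOptions with ecmaVersion: 8 to .eslintrc.js, so that ESLint parses ES2017 syntax, which is where async functions were introduced. The whole file now looks like the following: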

See this content in the original post

From here, deployment works fine. Next, I set up Postman to match up to the API with some test data.

And after clicking send …

Now I create a JSON object with the field names and the values taken from the keys provided in the call. I also move the logging further down to capture the document ID. Lastly, I commented out the date, as I don’t think it would be useful in its current form.

See this content in the original post
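A minimal sketch of that mapping step. The field names (author, title, url, posted_at) and the helper name buildArchiveRecord are assumptions, not the actual schema.

```javascript
// Map the incoming keys onto the fields stored in Firestore.
// Field names here are assumptions for illustration.
function buildArchiveRecord(body) {
  return {
    author: body.author,
    title: body.title,
    url: body.url,
    // posted_at: body.posted_at, // left out: a free-form string is hard to use
  };
}

const example = buildArchiveRecord({
  author: "u/example",
  title: "Test post",
  url: "https://www.reddit.com/r/test/",
  posted_at: "September 30, 2022 at 10:15AM",
});
console.log(example);
```

Building the record explicitly, rather than storing the whole body, keeps unexpected keys out of the collection.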

Now I update the IFTTT applet to remove the posted_at ingredient and create another post in r/test. And after all of that:

Stage 5: Little improvements

Things are working, but there are just a couple of changes to make. I removed the IFTTT source parameter from the URL and added a timestamp, set to the current time, for when the post is archived.

See this content in the original post
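A sketch of those two tweaks using Node’s built-in URL class. The helper names stripUtmSource and withArchiveMetadata and the archived_at field name are hypothetical. A JavaScript Date is used for the timestamp; with the Admin SDK, admin.firestore.FieldValue.serverTimestamp() would be an alternative.

```javascript
// Remove the utm_source tracking parameter, leaving the rest of the URL intact.
function stripUtmSource(postUrl) {
  const parsed = new URL(postUrl);
  parsed.searchParams.delete("utm_source");
  return parsed.toString();
}

// Return a copy of the record with a cleaned URL and an archive timestamp.
// `now` is injectable so the result is predictable when experimenting.
function withArchiveMetadata(record, now = new Date()) {
  return {
    ...record,
    url: stripUtmSource(record.url),
    archived_at: now,
  };
}
```

For example, withArchiveMetadata({author: "u/example", url: "https://www.reddit.com/r/test/?utm_source=ifttt"}) would yield a record whose url no longer carries the query string.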

Moving Forward

This experiment has worked out nicely and has given me the confidence to deploy it into production. The next step is to deploy for r/LondonR4R and get the feed automatically archived. This will lead to Encounters London being the place to go to view the previous posts in the feed, as part of the go-to-market strategy for this product. I will therefore build the interface in Flutter.