How to make your own free news site

Anythingmachine

If you click the 'News' page on this blog, the link sends you to the Anythingmachine news site. It is easily one of my favorite tools I've ever built, and the one I use the most. I've read a curated list of RSS feeds through various interfaces (Google Reader, various desktop clients, and most recently Feedly) for quite a while now, and I have never been 100% happy with any of them. I'd considered building something closer to my ideal for a while, but only recently put any real effort into it. I spent a couple of days hacking together something that I enjoy using, that is easily shareable (a problem I've had with many of the options), and that is, best of all, free to run!

Feedme (Seymour)

I'm not very UI-oriented, so I wanted something very simple. I decided to focus strictly on presentation and accepted that a read-only format, or one that required manual updates, was fine. Once I made that decision, I figured putting it on a web page would probably be the easiest solution. After extracting my site list from Feedly, I started experimenting with the Python feedparser module to pull down some sample entries and generate text output formatted roughly how I wanted it. Once I had something serviceable, I added headers and footers around the content and a few HTML tags to improve overall legibility. I added a sprinkle of CSS and a favicon, and had something that didn't look too bad.
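Roughly, that first experiment could look like the sketch below. It is not the code from the repo: the function names (fetch_entries, render_page), the dict keys, and the style.css / favicon.ico file names are made up for illustration. Summaries are escaped here to keep the sketch simple, even though real feed summaries often carry their own HTML.

```python
from html import escape


def fetch_entries(feed_url):
    """Return a list of plain dicts for one feed (needs the feedparser package)."""
    # Third-party module; imported here so render_page() works without it.
    import feedparser

    parsed = feedparser.parse(feed_url)
    source = parsed.feed.get("title", feed_url)
    return [
        {
            "source": source,
            "title": entry.get("title", "(untitled)"),
            "link": entry.get("link", ""),
            "summary": entry.get("summary", ""),
            "published": entry.get("published", ""),
        }
        for entry in parsed.entries
    ]


def render_page(entries, page_title="My news"):
    """Wrap the entries in a minimal HTML header and footer."""
    parts = [
        "<!DOCTYPE html>",
        f"<html><head><meta charset='utf-8'><title>{escape(page_title)}</title>",
        "<link rel='stylesheet' href='style.css'>",  # hypothetical file name
        "<link rel='icon' href='favicon.ico'></head><body>",  # hypothetical file name
        f"<h1>{escape(page_title)}</h1>",
    ]
    for e in entries:
        parts.append(
            f"<article><h2><a href='{escape(e['link'])}'>{escape(e['title'])}</a></h2>"
            f"<p class='source'>{escape(e['source'])}</p>"
            f"<p>{escape(e['summary'])}</p></article>"
        )
    parts.append("</body></html>")
    return "\n".join(parts)
```

Calling render_page(fetch_entries(url)) for a single feed URL and writing the result to index.html is enough to get a first page on screen.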

Next, I needed to figure out how to limit the output to only the more recent posts. I won't lie, one of my least favorite things to do in Python is date logic, so instead of spending a lot of time hammering on this, I decided to let my old friend SQLite handle much of it for me. Along the way I learned about the wonderful dateutil module, which takes the highly inconsistent date/time formats found in most of the feeds and normalizes them. Once that was working (there are still a few hacks involved), I inserted the output into an SQLite DB. From there, I just needed to query the DB in descending date/time order and limit the results to the number of posts I want to show on the site. I could now grab the N most recent articles with very little code.
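Here is a small sketch of that idea, under a few assumptions: the table layout, the function names, and the choice to treat timezone-less dates as UTC are all illustrative and not taken from the actual script. Dates are stored as ISO 8601 strings in UTC, so SQLite can sort them as plain text.

```python
import sqlite3
from datetime import timezone


def normalize_date(raw):
    """Parse a feed's date string and return it as an ISO 8601 UTC string."""
    # Third-party module (python-dateutil); imported here so the
    # SQLite helpers below work without it.
    from dateutil import parser

    dt = parser.parse(raw)
    if dt.tzinfo is None:
        # Assumption: a date without a timezone is treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).isoformat()


def open_cache(path=":memory:"):
    """Open (or create) the cache DB; the schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        " link TEXT PRIMARY KEY, title TEXT, summary TEXT, published TEXT)"
    )
    return conn


def store(conn, posts):
    """Insert (link, title, summary, published) tuples, skipping duplicate links."""
    conn.executemany("INSERT OR IGNORE INTO posts VALUES (?, ?, ?, ?)", posts)
    conn.commit()


def latest(conn, limit):
    """Return the `limit` newest posts, newest first."""
    return conn.execute(
        "SELECT title, link, summary, published FROM posts"
        " ORDER BY published DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

Each entry's date would go through normalize_date() before store(), and latest(conn, 50) would then return the 50 newest posts to render.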

Once I started adding more posts, I ran into an unexpected issue. I didn't want to let the summaries run on forever, so I decided to truncate them at a certain character length. I started seeing strange artifacts partway through the rendered page and realized that some of the summaries contained opening HTML tags which, due to truncation, were never being closed. This led to interesting things like half the page suddenly being printed in bold, as a HEADER, or in italics. I did a little research and found that the BeautifulSoup module handles just this situation out of the box: parsing the truncated fragment and writing it back out closes any tags left open. I had a method to build a page, so now the question was: how do I automate and serve this thing?
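A sketch of the truncate-then-repair step could look like this. The function name, the 300-character default, and the regex that drops a half-cut tag are assumptions for illustration; the character count also includes markup, which a more careful version might want to avoid.

```python
import re

from bs4 import BeautifulSoup  # third-party: beautifulsoup4


def truncate_html(summary, max_chars=300):
    """Cut a summary to max_chars and close any tags the cut left open."""
    if len(summary) <= max_chars:
        return summary
    cut = summary[:max_chars]
    # Drop a tag that was sliced in half, e.g. a trailing "<a href=".
    cut = re.sub(r"<[^>]*$", "", cut)
    # Parsing and re-serializing adds the missing closing tags.
    return str(BeautifulSoup(cut, "html.parser")) + "..."
```

With this in place, a summary cut in the middle of a bold or heading tag no longer bleeds its formatting into the rest of the page.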

GitHub Actions & Pages

Honestly, I have too many ways to serve things. I had an idea in the back of my head that I really, really wanted to run this thing for 'free'. I've been thinking about this problem space mostly in the context of the Fediverse (nope, haven't solved that yet, past letting others shoulder the expense for me), and I think it just carried over to this project. I could very easily have put this on my DigitalOcean droplet and generated the page via cron, but I've also been playing around with GitHub Actions lately, and that, combined with Pages, seemed like it might work. While I'm using SQLite, it's effectively a cache, so the fact that it gets deleted between runs doesn't matter. I was able to leverage a few existing GitHub Actions images to put together something that triggers my script, generates the HTML file, and checks it into my repo, which is served by GitHub Pages. I did a little math to figure out how often I could run this and still stay within the free tier of Actions, and, voilà, I now had a news site that is regenerated every 4 hours and completely free for me to host!
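The math is simple enough to sketch. The numbers you feed in are placeholders, not the figures used for Anythingmachine: the monthly budget and the minutes per run are assumptions to replace with your own plan's allowance and your workflow's measured run time. The helper names are made up too.

```python
import math


def runs_per_month(interval_hours, days=31):
    """Number of scheduled runs in a month of `days` days."""
    return math.ceil(days * 24 / interval_hours)


def minutes_used(interval_hours, minutes_per_run, days=31):
    """Minutes consumed per month at the given schedule."""
    # Assumption: each run is rounded up to a whole minute.
    return runs_per_month(interval_hours, days) * math.ceil(minutes_per_run)


def shortest_interval_hours(monthly_budget_minutes, minutes_per_run, days=31):
    """Smallest whole-hour interval (1 to 24) that stays within the budget."""
    for hours in range(1, 25):
        if minutes_used(hours, minutes_per_run, days) <= monthly_budget_minutes:
            return hours
    return None  # even one run per day would exceed the budget
```

For example, a run every 4 hours works out to runs_per_month(4) == 186 runs in a 31-day month, so a job that takes about two minutes would use roughly 372 minutes.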

Get out there and make some news

I use the site almost every day, sometimes multiple times a day, and love the fact that I can easily share it with anyone. Anythingmachine is mostly tech news, but depending on your taste there are LOTS of feeds to tap into to make a personal internet front page. I'd highly encourage everyone to build their own site and share it. In a world where there is so much turmoil and distrust around news, there's no better editor than you.

Code

Everything discussed here is available at: https://github.com/looprock/looprock.github.io/