Hacker News API: The Complete Guide to Algolia Search and Firebase Data


I once wasted an entire afternoon writing a recursive crawler to fetch HN comment threads one by one from Firebase before someone pointed out that Algolia already returns the whole tree in a single call. Classic.

That's the thing about the Hacker News API — there are actually two of them, and most people only know about one. Which one you pick first shapes how you think about the problem, and sometimes you pick wrong. So here's the cheat sheet I wish I'd had.

Two APIs, One Hacker News

The official Firebase API (hacker-news.firebaseio.com) is run by HN itself. Raw access to items, users, and live feeds. It's the canonical source — when you need exactly one item by its ID or a real-time stream of what's hitting the front page right now, Firebase is the move.

Then there's the Algolia Search API (hn.algolia.com/api/v1). Algolia crawls and indexes everything on HN and slaps a full-text search layer on top. Date filters, tag filters, pagination, the works. Anytime I need to find stuff rather than fetch a known thing, I reach for Algolia.

Quick decision tree:

Searching or filtering content? Algolia, no contest.
Grabbing one item you already have the ID for? Either works. Firebase is slightly more straightforward.
Real-time front page / new story stream? Firebase — it has dedicated live endpoints and supports SSE (Server-Sent Events).
Loading an article's full comment thread? Algolia, unless you enjoy writing recursive fetch loops. (I did not enjoy it.)

The Algolia Hacker News Search API

No auth, no API key, completely free. Two search endpoints with one difference between them — sort order.

By relevance:

GET http://hn.algolia.com/api/v1/search?query=YOUR_QUERY

By date (newest first):

GET http://hn.algolia.com/api/v1/search_by_date?query=YOUR_QUERY

Honestly, I almost always use search_by_date. When I'm tracking competitor mentions or watching how a topic evolves week over week, I want chronological. The relevance ranking is fine for one-off searches but it mixes old viral posts with recent stuff in ways that muddy the picture.

Parameters Worth Knowing

query — Full-text search. You know the drill.
tags — This one's more powerful than it looks. You can pass story, comment, ask_hn, show_hn, poll, or front_page. Combine them for AND logic: tags=story,author_pg gives you stories posted by pg. Wrap in parens for OR: tags=story,(author_pg,author_dang).
numericFilters — My favorite. Filter on points, num_comments, or created_at_i (Unix timestamp). Something like numericFilters=points>100,num_comments>50 weeds out noise fast.
page — Zero-indexed. Combine with hitsPerPage (max 1000) for pagination.
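To make the parameter rules above concrete, here is a minimal Python sketch that assembles a search URL from them. The helper name build_search_url and its defaults are my own, not part of the API, and the https:// base URL is an assumption (the endpoints above are written with http://). Note that author_pg follows the author_USERNAME tag form.

```python
from urllib.parse import urlencode

# Assumption: the HTTPS variant of the host shown above.
BASE_URL = "https://hn.algolia.com/api/v1"


def build_search_url(query, tags=None, numeric_filters=None,
                     page=0, hits_per_page=20, by_date=False):
    """Return a search URL; tags and numeric_filters are lists of strings."""
    params = {"query": query, "page": page, "hitsPerPage": hits_per_page}
    if tags:
        # Comma-separated tags are ANDed; a parenthesised group is ORed.
        params["tags"] = ",".join(tags)
    if numeric_filters:
        # Comma-separated numeric conditions must all hold.
        params["numericFilters"] = ",".join(numeric_filters)
    endpoint = "search_by_date" if by_date else "search"
    return f"{BASE_URL}/{endpoint}?{urlencode(params)}"


# Stories by pg OR dang, with more than 100 points and more than 50 comments.
url = build_search_url(
    "rust",
    tags=["story", "(author_pg,author_dang)"],
    numeric_filters=["points>100", "num_comments>50"],
    by_date=True,
)
print(url)
```

urlencode percent-encodes the commas, parentheses, and comparison operators, so the printed URL looks noisier than the hand-written ones above but decodes to the same parameters.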

Real Example

All stories about "vector databases" from the past month, with more than 10 points:

# GNU date syntax shown; on macOS/BSD use $(date -v-30d +%s) instead
curl "http://hn.algolia.com/api/v1/search_by_date?\
query=vector+databases&\
tags=story&\
numericFilters=created_at_i>$(date -d '30 days ago' +%s),points>10"

You get back a JSON blob with a hits array — each hit has title, url, author, points, num_comments, created_at, objectID (the HN item ID), and a few other fields. nbHits at the top level tells you the total match count across all pages.
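If you would rather not depend on the shell's date command, the cutoff timestamp and the response handling can be sketched in Python. The function names are hypothetical, and the sample payload is hand-written to mirror the fields listed above, not real API output.

```python
import time


def cutoff_timestamp(days, now=None):
    """Unix timestamp for `days` ago; `now` is injectable for testing."""
    now = time.time() if now is None else now
    return int(now - days * 86400)


def summarize_hits(payload):
    """Pull a few of the fields named above out of a search response dict."""
    return [
        {
            "id": hit["objectID"],
            "title": hit.get("title"),
            "points": hit.get("points"),
            "comments": hit.get("num_comments"),
        }
        for hit in payload.get("hits", [])
    ]


# Hand-written sample shaped like a response; the values are made up.
sample = {
    "nbHits": 2,
    "hits": [
        {"objectID": "101", "title": "Example A", "points": 42, "num_comments": 7},
        {"objectID": "102", "title": "Example B", "points": 15, "num_comments": 0},
    ],
}

# The numericFilters value for "past 30 days, more than 10 points".
numeric_filter = f"created_at_i>{cutoff_timestamp(30)},points>10"
print(numeric_filter)
for row in summarize_hits(sample):
    print(row)
```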

Getting the Full Comment Thread

GET http://hn.algolia.com/api/v1/items/:id

Hands down the most useful Algolia endpoint. Give it any HN item ID and it hands back the item plus the entire comment tree as nested children objects. One HTTP request, done. I use this constantly — find an interesting story via search, then pull the thread to see what people actually said.
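Once you have that nested response, walking it is a short recursive generator. This is a sketch that assumes each node carries author, text, and children keys; the sample item is hand-written for illustration, and .get is used in case a field is absent on some comments.

```python
def walk_comments(item, depth=0):
    """Yield (depth, author, text) for every comment under `item`, depth-first."""
    for child in item.get("children", []):
        yield depth, child.get("author"), child.get("text")
        yield from walk_comments(child, depth + 1)


# Hand-written stand-in for an /items/:id response.
sample_item = {
    "id": 1, "title": "Example story", "children": [
        {"id": 2, "author": "alice", "text": "First", "children": [
            {"id": 3, "author": "bob", "text": "Reply", "children": []},
        ]},
        {"id": 4, "author": "carol", "text": "Second", "children": []},
    ],
}

# Print the thread with two spaces of indentation per nesting level.
for depth, author, text in walk_comments(sample_item):
    print("  " * depth + f"{author}: {text}")
```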

The Firebase Hacker News API

Firebase is bare-bones by design. No search, no filtering. Just clean REST endpoints returning JSON. No auth needed here either.

Items

GET https://hacker-news.firebaseio.com/v0/item/{id}.json

Stories, comments, polls, jobs — they're all "items" with an integer ID. You get back type, by, time, text, url, score, title, and kids. That kids field is where things get annoying: it's just an array of child item IDs, not the actual child objects. Want the comments on a 200-comment story? That's 200 separate fetches. You can parallelize them, sure, but it's still ugly compared to Algolia's single-call approach.
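For comparison, here is roughly what the Firebase version of "fetch the whole thread" looks like when parallelized. It is a sketch, not a hardened crawler: fetch_thread is a hypothetical helper that walks the kids arrays level by level with one thread pool, and the fetch function is injectable so the demo below runs against an in-memory stub instead of the network. I am assuming Firebase answers null for an unknown ID, which json.load turns into None.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{id}.json"


def fetch_item(item_id):
    """Fetch one item from Firebase (one HTTP request per item)."""
    with urlopen(ITEM_URL.format(id=item_id), timeout=10) as resp:
        return json.load(resp)


def fetch_thread(root_id, fetch=fetch_item, max_workers=8):
    """Fetch an item and all its descendants; returns {id: item}."""
    items = {}
    frontier = [root_id]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            # Fetch the whole current level in parallel.
            fetched = list(pool.map(fetch, frontier))
            frontier = []
            for item in fetched:
                if item is None:  # assumed response for a missing item
                    continue
                items[item["id"]] = item
                frontier.extend(item.get("kids", []))
    return items


# In-memory stand-in for the API so the example does not hit the network.
fake_store = {
    1: {"id": 1, "type": "story", "kids": [2, 3]},
    2: {"id": 2, "type": "comment", "kids": [4]},
    3: {"id": 3, "type": "comment"},
    4: {"id": 4, "type": "comment"},
}
thread = fetch_thread(1, fetch=fake_store.get)
print(sorted(thread))
```

Calling fetch_thread(some_id) with the default fetch function issues one request per item, which is exactly the overhead the Algolia endpoint avoids.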

Users

GET https://hacker-news.firebaseio.com/v0/user/{username}.json

Gives you id, created, karma, about, and submitted. The submitted array is every item they've ever posted — stories, comments, everything. Can get large for prolific users.

The Live Feeds

This is honestly the killer feature of the Firebase API. Dedicated endpoints for different story rankings:

/v0/topstories.json — Top 500 right now
/v0/newstories.json — Newest 500
/v0/beststories.json — Best 500
/v0/askstories.json — Ask HN
/v0/showstories.json — Show HN
/v0/jobstories.json — Job posts

Each one returns an array of item IDs. What makes this really interesting is Firebase's native support for SSE streaming — instead of polling every 30 seconds, you open a persistent connection and get notified when the list changes. I've used this to build dashboards that update in real time. Pretty slick for a free API with zero setup.
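A rough sketch of the streaming side, using only the standard library. Firebase's REST streaming is requested with an Accept: text/event-stream header; the event names used here (put, keep-alive) and the {"path", "data"} payload shape are my understanding of that protocol, so treat them as assumptions and check them against a live stream. The parser is separated from the network call so it can be exercised on canned lines.

```python
import json
from urllib.request import Request, urlopen

FEED_URL = "https://hacker-news.firebaseio.com/v0/topstories.json"


def parse_sse(lines):
    """Yield (event, payload) pairs from an iterable of decoded SSE lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and (event is not None or data):
            # A blank line terminates the current event.
            payload = json.loads("\n".join(data)) if data else None
            yield event, payload
            event, data = None, []


def stream_top_stories():
    """Open a streaming connection and yield events as they arrive."""
    request = Request(FEED_URL, headers={"Accept": "text/event-stream"})
    with urlopen(request) as response:
        yield from parse_sse(line.decode("utf-8") for line in response)


# Canned lines standing in for a live stream; the IDs are made up.
sample_stream = [
    "event: put\n",
    'data: {"path": "/", "data": [3, 1, 2]}\n',
    "\n",
    "event: keep-alive\n",
    "data: null\n",
    "\n",
]
for event, payload in parse_sse(sample_stream):
    if event == "put":
        print("new list:", payload["data"])
```

Against the real feed you would loop over stream_top_stories() instead of the canned lines. A production version would also need reconnect logic, which this sketch leaves out.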

Patterns I Keep Coming Back To

The search-then-fetch combo. Algolia search to find relevant stories, then Algolia /items/:id to grab the full discussion. This two-step pattern covers maybe 80% of what I build.

The mention monitor. Cron job that hits search_by_date with your company name (or a competitor's), filtered to stories only. Stash the objectID of everything you've already processed. New IDs = new mentions = Slack notification. This is how you find out about HN threads about your product before your CEO forwards them to you in a panic. If you'd rather skip the cron job plumbing, a news intelligence monitor does this out of the box — watches HN, Reddit, and tech news for keywords you care about and sends you a digest.
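The "stash the objectID" part of the mention monitor can be sketched like this. The helper names and the JSON state file are hypothetical choices of mine; the search call and the Slack notification are left out so the example stays self-contained.

```python
import json
from pathlib import Path


def load_seen(path):
    """Load the set of already-processed objectIDs (empty if no file yet)."""
    p = Path(path)
    return set(json.loads(p.read_text())) if p.exists() else set()


def save_seen(path, seen):
    """Persist the processed objectIDs for the next cron run."""
    Path(path).write_text(json.dumps(sorted(seen)))


def new_mentions(hits, seen):
    """Return hits not processed before, recording their IDs in `seen`."""
    fresh = []
    for hit in hits:
        if hit["objectID"] not in seen:
            seen.add(hit["objectID"])
            fresh.append(hit)
    return fresh


# Made-up hits standing in for a search_by_date response.
hits = [
    {"objectID": "101", "title": "Old thread"},
    {"objectID": "102", "title": "New thread"},
]
seen = {"101"}
fresh = new_mentions(hits, seen)
print([h["title"] for h in fresh])
```

In a cron job you would call load_seen at the start, notify on whatever new_mentions returns, and call save_seen at the end.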

User enrichment. Spot an interesting commenter through Algolia, then pull their Firebase profile. High karma + years of activity = someone whose opinion carries weight. New account + combative comments = maybe don't engage.
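As a sketch of that heuristic, the function below turns a Firebase user object into a quick summary. The karma and account-age thresholds are arbitrary numbers I picked for illustration, and the sample profile is made up; it assumes created is a Unix timestamp in seconds.

```python
import time

# Hypothetical thresholds; tune them to taste.
MIN_KARMA = 1000
MIN_YEARS = 2
SECONDS_PER_YEAR = 365.25 * 86400


def profile_summary(user, now=None):
    """Summarize a Firebase user dict; `now` is injectable for testing."""
    now = time.time() if now is None else now
    age_years = (now - user["created"]) / SECONDS_PER_YEAR
    return {
        "id": user["id"],
        "karma": user["karma"],
        "age_years": round(age_years, 1),
        "established": user["karma"] >= MIN_KARMA and age_years >= MIN_YEARS,
    }


# Made-up profile with the fields the user endpoint returns.
sample_user = {"id": "example_user", "created": 1_000_000_000, "karma": 5000}
print(profile_summary(sample_user, now=1_157_788_000))
```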

Trend tracking. Run the same search_by_date query with rolling weekly time windows via numericFilters on created_at_i. Count results per window, throw it in a chart. Rough but surprisingly effective for tracking developer interest in a technology.
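The rolling windows are easy to get subtly wrong, so here is a small sketch that generates them along with the matching numericFilters strings. The function names are mine; each window is half-open (start inclusive, end exclusive) so no story is counted twice. The per-window count would come from nbHits in each response.

```python
WEEK = 7 * 86400  # seconds in a week


def weekly_windows(end_ts, weeks):
    """Return (start, end) Unix-timestamp pairs, oldest first, ending at end_ts."""
    return [
        (end_ts - (i + 1) * WEEK, end_ts - i * WEEK)
        for i in reversed(range(weeks))
    ]


def window_filter(start, end):
    """numericFilters value selecting items with start <= created_at_i < end."""
    return f"created_at_i>={start},created_at_i<{end}"


# Three weekly windows ending at an arbitrary example timestamp.
for start, end in weekly_windows(10 * WEEK, 3):
    print(window_filter(start, end))
```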

What People Actually Build With This

The raw APIs work great for scripts and prototypes. When things get more serious, you start running into the usual annoyances — pagination bookkeeping, deduplication across runs, timestamp wrangling.

Competitive intelligence is the big one I see. Developer tools companies tracking every HN mention of their competitors, flagging threads where products get compared side-by-side, pulling comment sentiment. HN is probably the highest-signal developer community out there for this kind of thing. People don't sugarcoat.

Content research is another. Before writing a technical post, I'll search HN for the topic and skim the top threads. What got traction, what got ripped apart in the comments, what angles nobody covered yet. The comments are often more valuable than the stories themselves.

Recruiting intelligence. The monthly "Who's Hiring" threads are a goldmine of structured job data if you parse them. Which companies are growing, which stacks are in demand, which cities are popping up more often.

Launch tracking. Show HN posts are basically public product announcements to a savvy technical audience. Following these in your vertical gives you weeks of lead time over waiting for TechCrunch or Product Hunt.

Skip the Plumbing

If stitching API calls together isn't how you want to spend your Tuesday, Cotera's Hacker News tool packages both APIs behind a single interface that AI agents can call directly. Search stories, pull articles with their full comment trees, look up users — without writing pagination logic or managing endpoint URLs. Useful when HN data is one input among many in a larger research workflow and you'd rather not maintain a standalone scraper for it.

No keys, no auth. Just tell your agent to go look something up on Hacker News.

Try These Agents

News Intelligence Monitor -- Track HN, Reddit, and tech news mentions of your brand, competitors, or any keyword
Reddit Research Deck -- Build research reports from Reddit and HN discussions on any topic
Brand Monitoring Agent -- Monitor mentions across HN, Twitter, Reddit, and news sites with Slack delivery
