DevTools Feed

#web-scraping

Screenshot of a command line interface showing the brandmd tool running and extracting font information.
AI Dev Tools

Font Flubs: 1 in 5 Sites Misidentify Primary Typeface

A new tool designed to feed design systems to AI coding agents is running into a surprising data quality problem: 1 in 5 websites can't even correctly identify their own primary font. This isn't just an edge case; it's a systemic issue.

6 min read 2 days, 19 hours ago
A stylized graphic showing data flowing from search engine logos into a structured JSON code block.
DevOps & Platform Eng

Search API Shifts: JSON Results Unlock New Workflows

The humble search result page is getting a seismic shake-up. Forget scraping headaches; structured JSON data is the new frontier for developers.

6 min read 1 week, 3 days ago
n8n workflow nodes connecting to AlterLab scraping API for automated data extraction
DevOps & Platform Eng

n8n Meets AlterLab: The No-Fail Recipe for Automated Web Scraping Pipelines

Web scraping shouldn't be a cat-and-mouse game with bots. n8n paired with AlterLab turns it into a set-it-and-forget-it pipeline, dodging defenses while dumping clean data into your DB.

5 min read 1 month, 1 week ago
Diagram of MCP server bridging AI agents to Apify Korean web scrapers
AI Dev Tools

REST to MCP: Supercharging AI Agents with Korean Web Scrapers

Imagine AI agents effortlessly querying Korean businesses on Naver, with no API wrangling required. One dev's MCP server just made that real, wrapping 13 scrapers into AI-native tools.

4 min read 1 month, 2 weeks ago
Scrapy pipeline diagram with rs-trafilatura extracting clean text from HTML
Open Source

Scrapy's New Best Friend: rs-trafilatura Pipeline Tears Through HTML Junk

Scrapy spiders spew raw HTML like a firehose of garbage. rs-trafilatura cleans it up, Rust-fast, right in your pipeline—no more manual parsing hell.

4 min read 1 month, 2 weeks ago
Code terminal displaying rs-trafilatura extraction results from Firecrawl scrape
AI Dev Tools

rs-trafilatura + Firecrawl: The Web Scraping Duo That Thinks Like a Journalist

Imagine scraping the web not with a blunt hammer, but with a scalpel that comes with confidence ratings. rs-trafilatura supercharges Firecrawl, turning raw HTML into gold-standard extracts.

4 min read 1 month, 2 weeks ago
Benchmark table showing rs-trafilatura outperforming Trafilatura and neural extractors on F1 score and speed
Open Source

rs-trafilatura Fixes Web Scraping's Dirty Secret: Non-Article Pages Finally Extract Right

Scraping the web just got smarter. rs-trafilatura classifies page types first, pulling clean content from forums and products that trip up every other tool—saving devs hours in RAG pipelines and SEO audits.

5 min read 1 month, 2 weeks ago
Node.js code screenshot showing Zappos category scraper output with price diffs
DevOps & Platform Eng

Scraping Zappos Weekly: From Chaotic Spot-Checks to Ruthless Price Audits

Growth teams waste hours on one-off scrapes. This Node.js blueprint turns them into automated weekly intel bombs, revealing competitor moves before they sting.

4 min read 1 month, 2 weeks ago
Playwright code snippet for GDPR-compliant business profile scraper
Databases & Backend

Scraping Legally: Playwright's GDPR Blueprint for 2026

Web scraping doesn't have to end in EU fines. Playwright makes GDPR compliance feasible — if you're disciplined.

4 min read 1 month, 2 weeks ago
Visualization of scraped Instagram comments dataset from Apify tool
Open Source

Million Instagram Comments Scraped: Apify's Hack Cracks Meta's Vault

Meta locks away Instagram comments like state secrets. Apify's scraper busts in, delivering a million at dirt-cheap rates, but don't get too cozy.

4 min read 1 month, 2 weeks ago
Line chart of competitor job posting spikes predicting enterprise pivot
AI Dev Tools

Scraping Rival Careers Pages for 6 Months: The Job Signals That Beat Market Research

Job postings spill secrets competitors hide in earnings calls. Six months of automated scraping revealed fundraises, tech rewrites, and upmarket shifts — all for under $5 a month.

4 min read 1 month, 2 weeks ago
Python code for scraping login-protected websites using requests session
DevOps & Platform Eng

Forget Selenium: Scrape Login Sites with Python Requests Alone

Selenium's the go-to for login-protected scraping, but it's a dinosaur—slow, hungry, and bot-bait. Here's how plain requests flips the script for most sites.

4 min read 1 month, 2 weeks ago
© 2026 DevTools Feed. All rights reserved.
