HttpClient PHP class
6th April 2003
I’ve been working in quite a roundabout fashion recently. My principal target is to build a collaborative blogging system. As part of this, I needed an RSS aggregator to allow a single blog to show the most recent entries from a number of other, related blogs. Then I needed a way of downloading RSS feeds from external sites. While thinking about this (although to be fair it’s pretty much a solved problem) I was inspired to build something that could cache whole sites. And that led me to need a PHP HTTP client class for retrieving information from the web. So I wrote one of those :)
HttpClient is similar in some ways to Snoopy (which I have been using and recommending for years) but takes a different approach and includes some interesting new features. Firstly, while Snoopy contains a bunch of code for parsing HTML (to extract forms, links and the like), HttpClient concentrates purely on the HTTP side of things, leaving HTML parsing to other classes. Secondly, HttpClient supports gzip encoding. And finally, HttpClient is designed to be used multiple times in a single session, and will store and resend cookies and referral information between requests.
The HttpClient site has example code, a manual and a demo which shows the client accessing Amazon.com with debug mode turned on.
As an aside, I learnt a couple of useful things about HTTP while putting the class together, both of them from reading comments in the PHP Manual. Firstly, HTTP 1.1 is best avoided from a scripting point of view: it requires support for chunked encoding if you want to avoid random hex added to your content, and provides no practical advantages over HTTP 1.0 (cookies, gzip encoding and the all important Host: header work just fine without it). Secondly, if you want to uncompress gzip encoded content from an HTTP response you need to remove the first 10 characters before running the gzinflate() function or it will fail with a mysterious error.
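Both points can be sketched in a few lines of PHP. This is only an illustration, not code from HttpClient: the function names buildHttp10Request() and inflateGzipBody() are made up for this example. The second function assumes the gzip header carries no optional fields, so it is exactly 10 bytes long, and it also drops the 8-byte checksum trailer, which is an extra precaution the paragraph above does not mention.

```php
<?php
// Hypothetical helpers for illustration only; not taken from HttpClient.

// Build a minimal HTTP/1.0 GET request. HTTP/1.0 does not require a
// Host header, but servers accept one, which keeps name-based virtual
// hosts working. Accept-Encoding asks the server for gzip if it has it.
function buildHttp10Request(string $host, string $path = '/'): string
{
    return "GET {$path} HTTP/1.0\r\n"
        . "Host: {$host}\r\n"
        . "Accept-Encoding: gzip\r\n"
        . "\r\n";
}

// Decode a gzip-encoded response body with gzinflate().
// gzinflate() expects a raw deflate stream, so the gzip wrapper has to
// be removed first. Returns the decoded string, or false if the input
// does not look like a plain gzip stream.
function inflateGzipBody(string $body)
{
    // 10-byte header + 8-byte trailer is the smallest possible wrapper.
    if (strlen($body) < 18) {
        return false;
    }
    // A gzip stream starts with the magic bytes 0x1f 0x8b.
    if (substr($body, 0, 2) !== "\x1f\x8b") {
        return false;
    }
    // Assumption: the flags byte is 0, meaning no extra fields, file
    // name or comment follow the fixed 10-byte header.
    if (ord($body[3]) !== 0) {
        return false;
    }
    // Skip the header and the trailing CRC32 and length fields.
    return gzinflate(substr($body, 10, -8));
}
```

On PHP 5.4 and later the built-in gzdecode() function understands the gzip header itself, so the manual stripping is only needed where that function is unavailable.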