One of the topics I talk to the community about a lot, and one that always gets a good response, is the black art of web scraping. As the world becomes ever hungrier for data, more often than not the data we want can be found on a website somewhere. But how do you get hold of that data so you can reshape it the way you want, do something interesting with it and add value to it? This is where web scraping comes in.
Web scraping involves two core skills. The first, obviously, is some programming knowledge to get started. The other, arguably more important, is knowing how a website is built, so you can identify where the data you want actually comes from. Is the data you need embedded in the page (really?), or is it in a CSS file, a tangled-up mess encoded in JavaScript, or generated from a JSON feed returned by an AJAX call to an API somewhere? Web scraping at the start is less about programming and more about investigation - this is something that a lot of folks who start to web scrape miss.
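As a rough illustration of that investigation step, here is a minimal Python sketch that checks where a value you can see in the browser actually lives in the raw HTML. The names `SourceFinder` and `locate`, and the three sample pages, are made up for this example and use only the standard library; treat it as one possible starting point under those assumptions, not the way to do it.

```python
from html.parser import HTMLParser


class SourceFinder(HTMLParser):
    """Hypothetical helper: notes whether a target string shows up in
    ordinary page markup or inside an inline <script> block."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.in_script = False
        self.found_in_markup = False
        self.found_in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Called for text between tags, including inline script bodies.
        if self.target in data:
            if self.in_script:
                self.found_in_script = True
            else:
                self.found_in_markup = True


def locate(html, target):
    """Return 'markup', 'script' or 'not in page' for the target string."""
    finder = SourceFinder(target)
    finder.feed(html)
    finder.close()
    if finder.found_in_markup:
        return "markup"
    if finder.found_in_script:
        return "script"
    return "not in page"


# Three made-up pages that all *display* the price 19.99 in a browser.
page_static = '<html><body><span class="price">19.99</span></body></html>'
page_inline = ('<html><body><div id="app"></div>'
               '<script>var data = {"price": "19.99"};</script></body></html>')
page_ajax = ('<html><body><div id="app"></div>'
             '<script src="app.js"></script></body></html>')

for name, page in [("static", page_static),
                   ("inline", page_inline),
                   ("ajax", page_ajax)]:
    print(name, "->", locate(page, "19.99"))
```

If the answer is "not in page", the value is most likely being loaded after the page renders, and the browser's network tab is the next place to look for the JSON feed behind it.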
One of my most popular series of articles is all about web scraping. Sometimes, however, text is not enough, and we need visual help to see what's going on. With this in mind, I have created a free web scraping course that teaches the fundamentals - it will take you from knowing little about the subject to having a solid knowledge of what you need to do when you start web scraping.
Most of the resources (both free and paid) I have come across assume the reader/student has more knowledge than they actually do - my aim with this course is to give you that important foundation knowledge and, of course, to do it for free :)
I am really enjoying the process of making courses (especially free ones) as a new way to pass skills and knowledge to others in the community. Here is some of the amazing feedback I have received so far. I am delighted with it, and it is the support of the community over the years that has allowed me to get to this stage.
"Finally, an instructor who knows how to teach students so that they are not lost or confused. Thank you!"
"Great course, wish I would have watched it before reading Python Scrapy documentation."
"I loved this course. It's a great introduction to web scraping! It's short, sweet and to the point!"
"Really impressed about theory and strategies to use... goes beyond my expectation"
If you want to learn the very fundamentals of web scraping - this is a great place to start, and it's free, and we like free:
Allen is a consulting architect with a background in enterprise systems. His current obsessions are IoT, Big Data and Machine Learning. When not chained to his desk he can be found fixing broken things, playing music very badly or trying to shape things out of wood. He runs his own company specializing in systems architecture and scaling for big data and is involved in a number of technology startups.
Allen is a chartered engineer, a Fellow of the British Computer Society, and a Microsoft MVP. He writes for CodeProject, C-Sharp Corner and DZone. He is currently completing a PhD in AI and is also a ball-throwing slave for his dogs.