What is Web Scraping?




Many websites, such as Google, Cymon.io, StackOverflow, and Twitter, maintain their own APIs when they decide to share information with the public. Some APIs are open to everyone, while others can only be accessed with a special key issued to each registered user. The data these APIs provide is in a structured format, so users can pick out what they need at a glance.
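To see what "structured" means in practice, here is a minimal sketch of reading an API response. The payload below is a made-up sample (the `items`, `title`, and `score` field names are assumptions for illustration, not the format of any particular API), and `titles_from_api` is a hypothetical helper name.

```python
import json

# Hypothetical payload: the field names are illustrative only and do not
# come from any real API.
SAMPLE_RESPONSE = (
    '{"items": ['
    '{"title": "What is web scraping?", "score": 12}, '
    '{"title": "How do I parse HTML?", "score": 7}'
    ']}'
)


def titles_from_api(raw_json):
    """Return the title of every item in a JSON API response."""
    data = json.loads(raw_json)  # structured data: one call gives dicts and lists
    return [item["title"] for item in data["items"]]


print(titles_from_api(SAMPLE_RESPONSE))
```

Because the response is already structured, no searching through markup is needed: each value is reached directly by its key.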

Not every website maintains an API to share its information.

But what if you need data from a website that does not maintain an API? This is where the idea of web scraping comes in. A scrap is a small piece of something, especially one that is left over after the greater part has been used. (Definition from Wiki)

Extracting the useful information from a website is called web scraping. The process usually involves converting unstructured data (hosted on the website) into structured data (the extracted useful data).
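The conversion from unstructured to structured data could look roughly like the sketch below. It uses only Python 3's built-in `html.parser` module rather than any third-party library, and the HTML snippet, the `LinkExtractor` class, and the `extract_links` function are all made up for illustration.

```python
from html.parser import HTMLParser

# Made-up page content standing in for the unstructured data on a website.
SAMPLE_HTML = """
<html><body>
  <h1>Articles</h1>
  <ul>
    <li><a href="/intro">Introduction</a></li>
    <li><a href="/basics">The Basics</a></li>
  </ul>
</body></html>
"""


class LinkExtractor(HTMLParser):
    """Collect every <a href="..."> on a page as a {"url", "text"} dict."""

    def __init__(self):
        super().__init__()
        self.links = []      # the structured result
        self._href = None    # href of the <a> tag currently open, if any
        self._text = []      # text pieces seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            self.links.append({"url": self._href, "text": text})
            self._href = None


def extract_links(html):
    """Turn raw HTML into a list of link records."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


for link in extract_links(SAMPLE_HTML):
    print(link["url"], link["text"])
```

The raw markup goes in as one long string, and what comes out is a list of records that can be saved, filtered, or sorted like any other structured data.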

Python can be used for scraping the web. Libraries such as urllib2 (fetches the URL; in Python 3 its functionality lives in urllib.request) and BeautifulSoup (extracts the required information) are available, among many others.
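As a rough sketch of the fetching step, the function below downloads a page with `urllib.request`, the Python 3 successor to urllib2. The `fetch_html` name, the 10 second timeout, and the `example-scraper/0.1` User-Agent string are assumptions chosen for illustration.

```python
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10):
    """Download a page and return its body as text."""
    # Hypothetical User-Agent value; some sites reject requests without one.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        # Use the charset the server declares, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Usage (needs network access; the URL is a placeholder):
# html = fetch_html("https://example.com/")
```

The returned string is the raw HTML of the page, which can then be handed to a parser such as BeautifulSoup to pull out the pieces of interest.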

You can get a clear idea of the libraries mentioned above from their documentation.

LINK: urllib2 Documentation

LINK: BeautifulSoup
