
Fetching The Website With Jsoup - Page View Source And Jsoup Shows Different Content

I use Jsoup to scrape the website: doc = Jsoup.connect(String.valueOf(urls[0])).userAgent("Mozilla").get(); Here is the link: http://www.yelp.com/search?find_desc=restaurant&am

Solution 1:

Short answer: Jsoup can't execute JavaScript.

Long answer

http://www.yelp.com/search?find_desc=restaurant&find_loc=willowbrook%2C+IL&ns=1#l=p:IL:Willowbrook::&sortby=rating&rpp=40

The webpage you are looking at accepts an HTTP GET with parameters. A normal browser sends those parameters and loads the page, but the page that comes back does not yet have the Willowbrook filter checked (as in your example). The JavaScript runs after the page has loaded, checks the filter box, and narrows the search results. Jsoup never runs that script, so you get more results: it loads the results for 'state=IL' without the 'willowbrook' filter applied.
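One way to see why the filter is missing is to split the URL at the '#'. The part after '#' (the fragment) is never sent in the HTTP request; only code running in the browser can read it. Assuming the Willowbrook filter is driven by that fragment, which the URL above suggests but the source does not confirm, a minimal sketch using only java.net.URI shows what the server actually receives. The class and method names here (FragmentDemo, serverVisiblePart, clientOnlyPart) are made up for illustration.

```java
import java.net.URI;

public class FragmentDemo {
    // Rebuilds the part of the URL that an HTTP client actually sends:
    // scheme, host, path and query, i.e. everything before the '#'.
    static String serverVisiblePart(String url) {
        URI uri = URI.create(url);
        String query = uri.getRawQuery();
        return uri.getScheme() + "://" + uri.getRawAuthority() + uri.getRawPath()
                + (query == null ? "" : "?" + query);
    }

    // Returns the fragment (the text after '#'), or null if there is none.
    // This part stays on the client and is only visible to scripts in a browser.
    static String clientOnlyPart(String url) {
        return URI.create(url).getRawFragment();
    }

    public static void main(String[] args) {
        String url = "http://www.yelp.com/search?find_desc=restaurant"
                + "&find_loc=willowbrook%2C+IL&ns=1"
                + "#l=p:IL:Willowbrook::&sortby=rating&rpp=40";
        System.out.println("Sent to server: " + serverVisiblePart(url));
        System.out.println("Browser only:   " + clientOnlyPart(url));
    }
}
```

If that assumption holds, Jsoup.connect(url).get() effectively requests only the first part, which would explain the unfiltered results. To get the filtered list you would need either a request whose query string carries the filter, or a tool that executes JavaScript.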
