Asked Monday, January 9, 2012

I'll explain my question with an example.
Suppose I go to the URL:
http://www.google.co.il/#q=university



Then I right-click and choose "view source", but I don't get the real HTML source.
I'm sure of that because when I search the source for distinctive words that appear in the rendered document, I get no results.



I know that in Chrome I can select something and choose "Inspect Element", and then I can see the real source code. However, I want to fetch the code with a Java program, so I want to understand why I don't see the real HTML source when I use "view source".



 Answers

Well, if you select "view source", you see the actual HTML source code that the server returned for the page in your address bar. However, the content you are looking for may be hidden from that view because the page embeds code that loads external content and inserts it into the HTML after the page has loaded.



If you still want to parse such a page automatically and cleanly, you need to run a full HTML engine such as WebKit. That is a lot of work, and it is in principle what the browser does for you when you use "Inspect Element". The other option is to find the lines in the page's HTML that load the external content and then load that content yourself. If you are lucky, this is not obfuscated on purpose and is fairly easy to achieve for small tasks.
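
The second option can be sketched in Java roughly as follows. This is only a minimal illustration, not a robust scraper: the class name `ScriptSourceFinder` and both method names are made up for this example, it assumes Java 11 or newer (for `java.net.http.HttpClient`), and it uses a simple regular expression rather than a real HTML parser, so it only finds plain `<script src="...">` tags. Content that the page loads through inline JavaScript (for example XHR calls) will not be found this way.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class ScriptSourceFinder {

    // Matches <script ... src="..."> or src='...', case-insensitively.
    // Group 1 is the quote character, group 2 is the attribute value.
    private static final Pattern SCRIPT_SRC = Pattern.compile(
            "<script\\b[^>]*?\\ssrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Downloads the raw HTML, i.e. roughly what "view source" shows.
    static String fetchRawHtml(URI page) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(page).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Returns the absolute URLs of all external scripts referenced in the HTML.
    // Relative URLs are resolved against the address of the page itself.
    static List<String> findScriptSources(String html, URI base) {
        List<String> sources = new ArrayList<>();
        Matcher m = SCRIPT_SRC.matcher(html);
        while (m.find()) {
            sources.add(base.resolve(m.group(2).trim()).toString());
        }
        return sources;
    }
}
```

To use it, call `fetchRawHtml` with the page address, pass the result to `findScriptSources`, and then download each returned URL in the same way to see what the page pulls in after it loads.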



However, if you need the whole DOM structure, you should think about embedding one of the browser engines...


Answered Sunday, January 8, 2012
ninaemiliaj
