How To Get A Page Source In Html?
Solution 1:
Since there is no accepted answer yet, I will post the method I use. I do not know whether it is the best way, but it works.
XHR may not be the best idea because it re-requests the page, so it will not always give you the exact source of a particular tab. Asking the tab's content script for its DOM does.
content.js
chrome.extension.onRequest.addListener(function(request, sender, callback) {
    if (request.action == 'getSource') {
        callback(document.documentElement.outerHTML);
    }
});
background.html
chrome.tabs.sendRequest(tab.id, {action: 'getSource'}, function(source) {
    console.log(source);
});
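For reference, `chrome.extension.onRequest` and `chrome.tabs.sendRequest` were later deprecated in favour of `chrome.runtime.onMessage` and `chrome.tabs.sendMessage`. A minimal sketch of the same idea with the newer calls might look like the following. The helper names `registerSourceListener` and `requestSource` are hypothetical, and the API objects are passed in as parameters only so the logic can be exercised outside a browser.

```javascript
// Content-script side: reply to a 'getSource' message with the page's current DOM.
// `runtime` is assumed to be chrome.runtime and `doc` the page's document.
function registerSourceListener(runtime, doc) {
  runtime.onMessage.addListener(function (request, sender, sendResponse) {
    if (request.action === 'getSource') {
      sendResponse(doc.documentElement.outerHTML);
    }
  });
}

// Background side: ask the content script in the given tab for its source.
// `tabs` is assumed to be chrome.tabs.
function requestSource(tabs, tabId, callback) {
  tabs.sendMessage(tabId, { action: 'getSource' }, callback);
}

// Hypothetical wiring inside an extension:
//   content.js:    registerSourceListener(chrome.runtime, document);
//   background.js: requestSource(chrome.tabs, tab.id, function (source) {
//                    console.log(source);
//                  });
```

As with the original snippet, this returns the DOM as the tab currently holds it, not the bytes the server originally sent.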
Solution 2:
As asked for, here is the source:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head id="Head1"><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>
Faculty of Media Engineering and Technology (MET) - The German University in Cairo
</title><!--[if gte IE 7 ]>
<!--><link type="text/css" href="Media/ResourceHandler.ashx?v=1&fileSet=homepage_css&type=text/css" rel="Stylesheet" /><script type="text/javascript" src="Media/ResourceHandler.ashx?v=1&fileSet=homepage_script&type=application/x-javascript"></script>
<![endif]-->
</head><bodyonload="init();"><formname="ctl00"method="post"action="Default.aspx"onsubmit="javascript:return WebForm_OnSubmit();"id="ctl00"><div><inputtype="hidden"name="ScriptManager1_HiddenField"id="ScriptManager1_HiddenField"value="" /><inputtype="hidden"name="__EVENTTARGET"id="__EVENTTARGET"value="" /><inputtype="hidden"name="__EVENTARGUMENT"id="__EVENTARGUMENT"value="" /><inputtype="hidden"name="__VIEWSTATE"id="__VIEWSTATE"value="/wEPDwUKMTYwMjE2MTE1MA9kFgICAw9kFgICBw8WAh4LXyFJdGVtQ291bnQCAhYEZg9kFggCAg8VAgdOZXdzXzE3GFNtYXJ0U29mdCBhcmUgdGhlIGNoYW1wc2QCAw8PFgIeCEltYWdlVXJsBS1+L1JlcG9zaXRvcnkvTmV3c0NvbXBvbmVudC9TT3JpZ2luYWxGaW5hbC5qcGdkZAIEDxUB8QFBZnRlciBjb21wZXRpbmcgYWdhaW5zdCBDU0VOIGFuZCBCSSBjb21wYW5pZXMuIFNtYXJ0U29mdCBtYW5hZ2VkIHRvDQp3aW4gdGhlIFNvZnR3YXJlIEVuZ2luZWVyaW5nIENvbXBldGl0aW9uIGZvciBTcHJpbmcgMjAxMCBhZnRlciBkZXZlbG9waW5nIGFuIG91dHN0YW5kaW5nIG9ubGluZSB0b29sIGZvciBhdXRvbWF0aW5nIGFnaWxlIHNvZnR3YXJlIG1hbmFnZW1lbnQuIENvbmdyYXR1bGF0aW9ucyB0byBTbWFydFNvZnQhZAIFDxYCHgVzdHlsZQUNZGlzcGxheTpub25lOxYCZg8VAQBkAgEPZBYIAgIPFQIGTmV3c18xHE1lZGlhIEVuZ2luZWVyaW5nIGF0IHRoZSBHVUNkAgMPDxYCHwEFJn4vUmVwb3NpdG9yeS9OZXdzQ29tcG9uZW50L2xpYnJhcnkuanBnZGQCBA8VAfEBTWVkaWEgRW5naW5lZXJpbmcgYW5kIFRlY2hub2xvZ3kgYWltcyBhdCB0aGUgZXZvbHZpbmcgZmllbGQgb2YgbmVhcmx5IGFsbCBhc3BlY3RzIG9mIGluZm9ybWF0aW9uIGFuZCBtdWx0aW1lZGlhIHByb2Nlc3NpbmcuIFRoZSBzdHVkeSBwcm9ncmFtIGluICJNZWRpYSBFbmdpbmVlcmluZyBhbmQgVGVjaG5vbG9neSIgcmVzdHMgb24gdGhlIHNhbWUgZnVuZGFtZW50YWxzIGFzIGZvciBJbmZvcm1hdGlvbiBUZWNobm9sb2d5LmQCBQ9kFgJmDxUBE0Fib3V0L1Byb2dyYW1zLmFzcHhkGAMFHl9fQ29udHJvbHNSZXF1aXJlUG9zdEJhY2tLZXlfXxYCBR1Mb2dpblVzZXJDb250cm9sMSRsb2dpbkJ1dHRvbgUkTG9naW5Vc2VyQ29udHJvbDEkUmVtZW1iZXJNZUNoZWNrYm94BRxMb2dpblVzZXJDb250cm9sMSRNdWx0aVZpZXcxDw9kZmQFNkxvZ2luVXNlckNvbnRyb2wxJEhvbWVwYWdlVG9vbHNNZW51Q29udHJvbDEkTXVsdGlWaWV3MQ8PZGZk++EYs51/1WiGabXN2nlBpWq7B38=" /></div><scripttype="text/javascript">
//<![CDATA[
var theForm = document.forms['ctl00'];
if (!theForm) {
theForm = document.ctl00;
}
function __doPostBack(eventTarget, eventArgument) {
if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
theForm.__EVENTTARGET.value = eventTarget;
theForm.__EVENTARGUMENT.value = eventArgument;
theForm.submit();
}
}
//]]>
</script><scriptsrc="/WebResource.axd?d=hjWzicBH57aDEOAXMpQVJQ2&t=633566901938396560"type="text/javascript"></script><scriptsrc="/ScriptResource.axd?d=H0761Oq7Alukyw82KELp8-Txl2kQFm7sZfTkrcnjSDzxZz0PrQZLm48rbx9Jm7dI_LMT2zH0QUfg9RJVLEsm7Q2&t=633566901938396560"type="text/javascript"></script><scriptsrc="/ScriptResource.axd?d=9NKqPW-jeqHS98DhHZ6Iy5ulSdcD3uOEBYcWmPxYVzi01PBdj_S7yBr5N-59MNCSkIHANMTKEfgCCoAEWIDetGqltgG2yF0m6QP4thTRHlI1&t=633432692861214540"type="text/javascript"></script><scriptsrc="/ScriptResource.axd?d=9NKqPW-jeqHS98DhHZ6Iy5ulSdcD3uOEBYcWmPxYVzi01PBdj_S7yBr5N-59MNCSkIHANMTKEfgCCoAEWIDetB9rztfIh11Bb3t4nicyu881&t=633432692861214540"type="text/javascript"></script><scriptsrc="/Default.aspx?_TSM_HiddenField_=ScriptManager1_HiddenField&_TSM_CombinedScripts_=%3b%3bAjaxControlToolkit%2c+Version%3d1.0.10920.32880%2c+Culture%3dneutral%2c+PublicKeyToken%3d28f01b0e84b6d53e%3aen-US%3a816bbca1-959d-46fd-928f-6347d6f2c9c3%3a9ea3f0e2%3ae2e86ef9%3a9e8e87e9%3a1df13a87%3a4c9865be%3aba594826%3a757f92c2"type="text/javascript"></script><scripttype="text/javascript">//<![CDATA[functionWebForm_OnSubmit() {
if (typeof(ValidatorOnSubmit) == "function" && ValidatorOnSubmit() == false) return false;
return true;
}
//]]></script><script type="text/javascript">
//<![CDATA[
Sys.WebForms.PageRequestManager._initialize('ScriptManager1', document.getElementById('ctl00'));
Sys.WebForms.PageRequestManager.getInstance()._updateControls([], [], [], 90);
//]]>
</script><!-- Page Container --><divid="container"><!-- Header and Menu --><divid="headerAndMenu"><!-- Title --><h1id="logo"><ahref="http://www.guc.edu.eg"target="_blank"><imgsrc="Media/Images/HomePage/Logo.png.ashx"alt="The German University in Cairo" /></a></h1><h2id="title">
Faculty of Media Engineering and Technology</h2><!-- Title --><!-- Menu --><divid="menu"><divid="mainPart"><divid="aboutMET"onmouseenter="opacity('aboutMETSubMenu',0,100,5)"onmouseleave="opacity('aboutMETSubMenu',100,0,5)"><divid="aboutMETSubMenu"><ulid="aboutMETSubMenuList"><liid="programs"><aclass="main"href="About/Programs.aspx">Programs</a></li><liid="degrees"><aclass="main"href="About/Degrees.aspx">Degrees</a></li><liid="ourPeople"><aclass="main"href="People/">OurPeople</a></li><liid="admission"><aclass="main"href="About/Admission.aspx">Admission</a></li></ul></div></div><divid="academics"onmouseenter="opacity('academicsSubMenu',0,100,5)"onmouseleave="opacity('academicsSubMenu',100,0,5)"><divid="academicsSubMenu"><ulid="academicsSubMenuList"><liid="underGraduate"><aclass="main"href="Courses/Undergrad.aspx">
Undergraduate Courses </a></li><liid="graduate"><aclass="main"href="Courses/Grad.aspx">
Graduate Courses</a></li><liid="courseCatalogue"><aclass="main"href="Courses/">Course
Catalogue</a></li><liid="research"><aclass="main"href="Research/">Research</a></li></ul></div></div><divid="extras"onmouseenter="opacity('extrasSubMenu',0,100,5)"onmouseleave="opacity('extrasSubMenu',100,0,5)"><divid="extrasSubMenu"><ulid="extrasSubMenuList"><liid="activities"><aclass="main"href="Activities/">Activities</a></li><liid="onlineTutorials"><aclass="main"href="OnlineTutorials/"
>Online Tutorials</a></li><liid="staffBlog"><aclass="main"href="#">Staff Blog</a></li><liid="showCase"><aclass="main"href="#">Showcase</a></li><liid="forum"><aclass="main"href="Forum/">Forum</a></li></ul></div></div><divid="agenda"onmouseenter="opacity('agendaSubMenu',0,100,5)"onmouseleave="opacity('agendaSubMenu',100,0,5)"><divid="agendaSubMenu"><ulid="agendaSubMenuList"><liid="announcements"><aclass="main"href="Agenda/Announcements.aspx">Announcements</a></li><liid="calendar"><aclass="main"href="Agenda/">Calendar</a></li><liid="policies"><aclass="main"href="About/Policies.aspx">Policies</a></li></ul></div></div></div></div><!-- /Menu --></div><!-- /Header and Menu --><!-- Content --><divid="content"><!-- Login --><divid="login"><divclass="homePageLoginDiv"><divid="Div1"style="position: relative; top: 5px;"><div><divclass="tools-menu-header"id="login_label"><imgstyle="border-style: none; vertical-align: middle; padding-right: 5px;"src="Media/Icons/key_go.png.ashx"><spanclass="label">Login</span></div><divclass="tools-menu-body"id="tools-menu-div"><label>
GUC Email
</label><divclass="leftTBoxSide"></div><div><inputname="LoginUserControl1$usernameTextBox"type="text"id="LoginUserControl1_usernameTextBox"class="userNameTBox" /></div><divclass="rightTBoxSide"></div><spanid="LoginUserControl1_LoginEmailRequiredFieldValidator"style="color:Red;display:none;"></span><spanid="LoginUserControl1_LoginEmailFormatValidator"style="color:Red;display:none;"></span><label>
Password</label><divclass="leftTBoxSide"></div><div><inputname="LoginUserControl1$passwordTextBox"type="password"id="LoginUserControl1_passwordTextBox"class="passwordTBox" /></div><divclass="rightTBoxSide"></div><spanid="LoginUserControl1_LoginPasswordRequiredFieldValidator"style="color:Red;display:none;"></span><inputtype="image"name="LoginUserControl1$loginButton"id="LoginUserControl1_loginButton"class="loginBtn"src="Media/Images/HomePage/goButton.gif.ashx"onclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("LoginUserControl1$loginButton", "", true, "", "", false, false))"style="border-width:0px;" /><spanclass="checkbox"><inputid="LoginUserControl1_RememberMeCheckbox"type="checkbox"name="LoginUserControl1$RememberMeCheckbox" /></span><labelclass="checkbox_label"for="LoginUserControl1_RememberMeCheckbox">
Remember me</label><aid="LoginUserControl1_forgotPasswordButton"class="forgotPasswordBtn"href="javascript:__doPostBack('LoginUserControl1$forgotPasswordButton','')">Forgot password?</a><spanstyle="margin-right: 3px;">Student?</span><ahref="Student/RegisterStudent.aspx">Register</a></div></div></div></div></div><!-- /Login --><!-- Search --><divid="search"></div><!-- /Search --><!-- News --><divid="news"><!-- News Glider--><divid="newsGlider"><divid="previousDiv"><imgsrc="Media/Images/HomePage/prev.png.ashx"id="previous"alt="Previous"onclick="my_glider.previous();return false;" /></div><divclass="scroller"><divclass="content"><inputtype="hidden"name="newsRepeater$ctl00$idHdnField"id="newsRepeater_ctl00_idHdnField"value="17" /><divclass="section"id='News_17'><h2class="newsTitle">
SmartSoft are the champs
</h2><img id="newsRepeater_ctl00_Image1" class="newsImage" alt="MET Stories" src="Repository/NewsComponent/SOriginalFinal.jpg" style="border-width:0px;" /><p class="newsParagraph">
After competing against CSEN and BI companies. SmartSoft managed to
win the Software Engineering Competition for Spring 2010 after developing an outstanding online tool for automating agile software management. Congratulations to SmartSoft!
</p><div id="newsRepeater_ctl00_morelink" style="display:none;"><a class="newsLink" href="" target="_blank">more</a></div></div><input type="hidden" name="newsRepeater$ctl01$idHdnField" id="newsRepeater_ctl01_idHdnField" value="1" /><div class="section" id='News_1'><h2 class="newsTitle">
Media Engineering at the GUC
</h2><img id="newsRepeater_ctl01_Image1" class="newsImage" alt="MET Stories" src="Repository/NewsComponent/library.jpg" style="border-width:0px;" /><p class="newsParagraph">
Media Engineering and Technology aims at the evolving field of nearly all aspects of information and multimedia processing. The study program in "Media Engineering and Technology" rests on the same fundamentals as for Information Technology.
</p><div id="newsRepeater_ctl01_morelink" style="display: block;"><a class="newsLink" href="About/Programs.aspx" target="_blank">more</a></div></div></div></div><img src="Media/Images/HomePage/next.png.ashx" id="next" alt="Next" onclick="my_glider.next();return false;" /><!--<div id="nextDiv"></div> --></div></div><!-- /News --><!-- Footer --><div id="footer"><h5 class="right"><a href="Feeds/RSS.aspx"><img src="Media/Icons/rss.png.ashx" alt="RSS" style="border-style: none; position: relative;
top: 3px; padding-right: 2px;" /><b>RSS</b> Feeds</a><a href="Credits/robusta.aspx">
Credits</a></h5><h5 class="left">
Copyright © 2008 GUC. All Rights Reserved.</h5></div><!-- /Footer --></div><!-- /Content --></div><!-- /Page Container --><!-- Extra Divs --><!-- /Extra Divs --><div id="glider_script"><script type="text/javascript" charset="utf-8">var my_glider = new Glider('newsGlider', {duration:1.0, autoGlide:true, frequency:15});
</script></div><script type="text/javascript">
//<![CDATA[
var Page_Validators = new Array(document.getElementById("LoginUserControl1_LoginEmailRequiredFieldValidator"), document.getElementById("LoginUserControl1_LoginEmailFormatValidator"), document.getElementById("LoginUserControl1_LoginPasswordRequiredFieldValidator"));
//]]></script><script type="text/javascript">
//<![CDATA[
var LoginUserControl1_LoginEmailRequiredFieldValidator = document.all ? document.all["LoginUserControl1_LoginEmailRequiredFieldValidator"] : document.getElementById("LoginUserControl1_LoginEmailRequiredFieldValidator");
LoginUserControl1_LoginEmailRequiredFieldValidator.controltovalidate = "LoginUserControl1_usernameTextBox";
LoginUserControl1_LoginEmailRequiredFieldValidator.errormessage = "Email required.";
LoginUserControl1_LoginEmailRequiredFieldValidator.display = "None";
LoginUserControl1_LoginEmailRequiredFieldValidator.evaluationfunction = "RequiredFieldValidatorEvaluateIsValid";
LoginUserControl1_LoginEmailRequiredFieldValidator.initialvalue = "";
var LoginUserControl1_LoginEmailFormatValidator = document.all ? document.all["LoginUserControl1_LoginEmailFormatValidator"] : document.getElementById("LoginUserControl1_LoginEmailFormatValidator");
LoginUserControl1_LoginEmailFormatValidator.controltovalidate = "LoginUserControl1_usernameTextBox";
LoginUserControl1_LoginEmailFormatValidator.errormessage = "Must be in the form of user@student.guc.edu.eg OR user@guc.edu.eg";
LoginUserControl1_LoginEmailFormatValidator.display = "None";
LoginUserControl1_LoginEmailFormatValidator.evaluationfunction = "RegularExpressionValidatorEvaluateIsValid";
LoginUserControl1_LoginEmailFormatValidator.validationexpression = "\\w+([-+.\']\\w+)*@(student.)?guc.edu.eg";
var LoginUserControl1_LoginPasswordRequiredFieldValidator = document.all ? document.all["LoginUserControl1_LoginPasswordRequiredFieldValidator"] : document.getElementById("LoginUserControl1_LoginPasswordRequiredFieldValidator");
LoginUserControl1_LoginPasswordRequiredFieldValidator.controltovalidate = "LoginUserControl1_passwordTextBox";
LoginUserControl1_LoginPasswordRequiredFieldValidator.errormessage = "Password required.";
LoginUserControl1_LoginPasswordRequiredFieldValidator.display = "None";
LoginUserControl1_LoginPasswordRequiredFieldValidator.evaluationfunction = "RequiredFieldValidatorEvaluateIsValid";
LoginUserControl1_LoginPasswordRequiredFieldValidator.initialvalue = "";
//]]></script><script type="text/javascript">
<!--
var Page_ValidationActive = false;
if (typeof(ValidatorOnLoad) == "function") {
ValidatorOnLoad();
}
function ValidatorOnSubmit() {
if (Page_ValidationActive) {
return ValidatorCommonOnSubmit();
}
else {
return true;
}
}
// --></script><script type="text/javascript">
//<![CDATA[
Sys.Application.initialize();
Sys.Application.add_init(function() {
$create(AjaxControlToolkit.ValidatorCalloutBehavior, {"closeImageUrl":"/WebResource.axd?d=E9XUtTpBpgn1nrqvm7JdsQNAiTSrs01kvYMfJ6_c6indZV0XUSo9nn5ewbqAXA5hefaIKnoyXSIFnFPdZX8u_dwAMV0u0RfKJgPDjFETh3g1&t=633887970297152468","highlightCssClass":"invalidInput","id":"LoginUserControl1_LoginEmailRequiredFieldValidatorExtender","warningIconImageUrl":"/WebResource.axd?d=E9XUtTpBpgn1nrqvm7JdsQNAiTSrs01kvYMfJ6_c6indZV0XUSo9nn5ewbqAXA5hpBwM9IEYB0L_JPlcVCV_StBVa8rc0SgI1L1ARCQ2e4o1&t=633887970297152468"}, null, null, $get("LoginUserControl1_LoginEmailRequiredFieldValidator"));
});
Sys.Application.add_init(function() {
$create(AjaxControlToolkit.ValidatorCalloutBehavior, {"closeImageUrl":"/WebResource.axd?d=E9XUtTpBpgn1nrqvm7JdsQNAiTSrs01kvYMfJ6_c6indZV0XUSo9nn5ewbqAXA5hefaIKnoyXSIFnFPdZX8u_dwAMV0u0RfKJgPDjFETh3g1&t=633887970297152468","highlightCssClass":"invalidInput","id":"LoginUserControl1_LoginEmailFormatValidatorExtender","warningIconImageUrl":"/WebResource.axd?d=E9XUtTpBpgn1nrqvm7JdsQNAiTSrs01kvYMfJ6_c6indZV0XUSo9nn5ewbqAXA5hpBwM9IEYB0L_JPlcVCV_StBVa8rc0SgI1L1ARCQ2e4o1&t=633887970297152468"}, null, null, $get("LoginUserControl1_LoginEmailFormatValidator"));
});
document.getElementById('LoginUserControl1_LoginEmailRequiredFieldValidator').dispose = function() {
Array.remove(Page_Validators, document.getElementById('LoginUserControl1_LoginEmailRequiredFieldValidator'));
}
document.getElementById('LoginUserControl1_LoginEmailFormatValidator').dispose = function() {
Array.remove(Page_Validators, document.getElementById('LoginUserControl1_LoginEmailFormatValidator'));
}
document.getElementById('LoginUserControl1_LoginPasswordRequiredFieldValidator').dispose = function() {
Array.remove(Page_Validators, document.getElementById('LoginUserControl1_LoginPasswordRequiredFieldValidator'));
}
//]]></script></form><script type="text/javascript">var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script><script type="text/javascript">var pageTracker = _gat._getTracker("UA-6040050-1");
pageTracker._trackPageview();
</script></body></html>
You can usually view the source of any website by pressing Ctrl+U (at least in Chrome).
Solution 3:
You need to read up on XHR; see http://code.google.com/chrome/extensions/xhr.html. This will let you load the contents of http://met.guc.edu.eg into a variable.
Then you need to read up on regular expressions, which will let you extract the information you want.
It is almost impossible to give a full answer without actually doing it.
You may find it easier to load the content in an iframe whose dimensions and scroll position you control.
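The XHR-plus-regexp approach described in this solution might be sketched roughly as follows. The function names `loadPage` and `extractTitle` are hypothetical, and pulling out the `<title>` is only an illustrative choice of "information you want". It also assumes the extension has been granted permission to request the target host.

```javascript
// Load a page's markup into a variable with XMLHttpRequest (browser/extension context).
function loadPage(url, onLoaded) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.onreadystatechange = function () {
    // readyState 4 means the request has finished.
    if (xhr.readyState === 4 && xhr.status === 200) {
      onLoaded(xhr.responseText);
    }
  };
  xhr.send();
}

// Extract the text of the first <title> element from an HTML string using a regexp.
// Returns null when no title is found.
function extractTitle(html) {
  var match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return match ? match[1].trim() : null;
}

// Hypothetical usage inside the extension:
//   loadPage('http://met.guc.edu.eg', function (html) {
//     console.log(extractTitle(html));
//   });
```

Regular expressions are workable for small, predictable fragments like this; for anything nested, parsing the markup into a DOM is usually less fragile.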