
Read-an-url

Form Javascript

Reading whole webpages into a Javascript variable living on another webpage has long been many a developer's dream. For years the available scripting languages just weren't up to it. Workarounds used hidden frames and required fancy features like outerHTML that many browsers didn't offer. Today's browsers have the XMLHttpRequest object, but it is still bound by the same-origin policy, which prohibits accessing URLs that are not at the same origin (protocol, domain and port) as the page running the script.
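
The same-origin rule can be illustrated with a small check. The sketch below only approximates the comparison a browser makes, in which protocol, domain and port must all match. The function name isSameOrigin is made up for this example, and it relies on the standard URL object, which older browsers do not have.

```javascript
// Hypothetical helper, not part of the original page:
// returns true when `target` shares its origin with `page`.
function isSameOrigin( target, page ) {
 // A relative target is resolved against the page address first.
 var a = new URL( target, page );
 var b = new URL( page );
 // The origin string combines protocol, host name and port.
 return a.origin === b.origin;
}

// A relative url always stays on the same origin.
console.log( isSameOrigin( 'other.html', 'http://4umi.com/web/javascript/readurl' ) );
// A different domain does not.
console.log( isSameOrigin( 'http://example.com/', 'http://4umi.com/web/javascript/readurl' ) );
```

A request that fails this kind of check is the case where XMLHttpRequest refuses to cooperate and a fallback is needed.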

Fallback on Java

An applet is the solution for browsers that do not support XMLHttp connections. Applets are small applications written in Java (another computer language, not related to Javascript in any direct way). The Java Virtual Machine seemingly runs inside the page, but is totally independent of the browser program and its limitations. The old-fashioned way, no longer supported by current standards, uses the <applet> tag to embed little Java programs in a webpage:

<applet name="readurl" code="ReadURL.class" width="400" height="50">
Sorry, failed loading applet.
</applet>

(The name of the file is its signature. It must appear with exactly this name and capitalization, ReadURL.class; otherwise objects will not be created, resulting in a fatal error in the applet.) However, HTML 4.0 deprecates the <applet> tag in favor of the <object> tag, which has a few additional attributes and is supposed to replace many other tags, even <img>, in the future. Older browsers do not know about it, so the applet tag is unlikely to disappear anytime soon. To accommodate both old and new, nest the applet tag inside the object tag as fallback content, which browsers that understand the object tag will ignore. Pages containing an applet tag will not pass the validity check at validator.w3.org, but they will work.

There is no need for it to be visible, but for demonstration purposes, the above tag looks like this on the page, wrapped in a bordered div element to aid visualization:

Sorry, failed loading applet.

Java can do things that Javascript has never heard of, and accessing remote machines just like that is one of them. This particular .class does exactly that and nothing else. It has a public method readFile(), which reads the content of the chosen file into a public variable fileContent. When finished, the applet sets another variable, finished, from 0 to 1. No checks for the existence of the requested document are made. As you might have guessed, the applet even relies on Javascript for the front end, where the user enters a URL to read and receives the response:

Request a test

Please choose an address and the applet will try to download it into the output area. Relative URLs are resolved against the current location.


The script

The example is kept simple. More advanced uses of these remote requests in real-world situations are not hard to imagine. The handling of the received data is limited only by the programmer's imagination.

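// Hand the url to the applet and start polling for the result.
function tojava( url ) {
 document.forms.appletform.elements.txt.value = '\n\tReading ' + url + '...';
 document.applets.readurl.readFile( url );
 // Pass the function itself rather than a string to be evaluated.
 window.setTimeout( fromjava, 60 );
}

// Copy the content to the textarea once the applet reports it is finished,
// otherwise look again in 60 milliseconds.
function fromjava() {
 var applet = document.applets.readurl;
 if( applet.finished ) {
  document.forms.appletform.elements.txt.value = applet.fileContent;
 } else {
  window.setTimeout( fromjava, 60 );
 }
}

// Start at once when the page is called with a url= parameter in the address.
var m = window.location.search.match( /[?&]url=([^&]*)/ );
if( m && m[1] ) {
 // decodeURIComponent replaces the deprecated unescape.
 tojava( decodeURIComponent( m[1] ) );
}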
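
Since the applet makes no checks of its own, a request that never completes may leave the script above polling indefinitely. A minimal sketch of a polling loop that gives up after a while could look like the following. The name pollApplet, its parameters and the stand-in object in the usage lines are assumptions made for illustration; they are not part of the original applet or script.

```javascript
// Hypothetical helper: look at applet.finished every `interval` ms and
// give up once `timeout` ms have been waited.
// `schedule` defaults to setTimeout; it is a parameter only so that the
// loop can be driven by something else, for instance in a test.
function pollApplet( applet, onDone, onTimeout, interval, timeout, schedule ) {
 var waited = 0;
 schedule = schedule || setTimeout;
 function check() {
  if( applet.finished ) {
   onDone( applet.fileContent );
  } else if( waited >= timeout ) {
   onTimeout();
  } else {
   waited += interval;
   schedule( check, interval );
  }
 }
 check();
}

// Usage with a stand-in object that mimics the applet's two public variables.
var fakeApplet = { finished: 1, fileContent: '<html>...</html>' };
pollApplet( fakeApplet,
 function( text ) { console.log( 'Received ' + text.length + ' characters' ); },
 function() { console.log( 'Gave up waiting' ); },
 60, 5000 );
```

On the page itself the first argument would be document.applets.readurl, and the two callbacks would write to the textarea instead of the console.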

There is no news under the sun.

Reference