Using the Apache HttpClient in a JavaScript Stage

Generally, when you're indexing data with your web connector, you don't have to worry about authentication or adding custom headers to your request. Sometimes, however, say when you need to handle a special authentication scheme, you need to take control of the request.

This can, of course, be accomplished with a bit of JavaScript. To that end, I've put together a sample that spins up an Apache HttpClient in a JavaScript stage:

    function (doc) {
        // Java classes used below, reached through the script engine's Java interop
        var BufferedReader = java.io.BufferedReader;
        var InputStreamReader = java.io.InputStreamReader;
        var HttpHeaders = org.apache.http.HttpHeaders;
        var HttpGet = org.apache.http.client.methods.HttpGet;
        var HttpClientBuilder = org.apache.http.impl.client.HttpClientBuilder;
        var StringBuffer = java.lang.StringBuffer;

        var result = new StringBuffer();
        try {
            // you may get this from the PipelineDocument: doc.getId()
            var url = "http://www.google.com/search?q=httpClient";

            var client = HttpClientBuilder.create().build();
            var request = new HttpGet(url);

            // add request header; HttpHeaders.USER_AGENT is the header name "User-Agent"
            // "Mozilla/5.0" is only an example value, substitute your own
            request.addHeader(HttpHeaders.USER_AGENT, "Mozilla/5.0");

            var response = client.execute(request);
            logger.info("RESPONSE Code : " + response.getStatusLine().getStatusCode());

            // read the response body line by line
            var rd = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
            var line;
            while ((line = rd.readLine()) !== null) {
                result.append(line);
            }
            rd.close();
        } catch (e) {
            logger.error(e);
        }
        logger.info(result.toString());
        return doc;
    }

Breaking it down: 

Really, all we're doing here is making a simple GET request. This could just as easily be a POST, PUT, and so on. We can add as many headers as required, and handle the result in some usable, or re-usable, fashion.
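
As a rough sketch of the "as many headers as required" point, the helper below copies a plain object of headers onto a request. The name `applyHeaders`, the stand-in `fakeRequest`, and the header values are all made up for illustration; the only assumption about the request is that it exposes `addHeader(name, value)`, as `HttpGet` and `HttpPost` do.

```javascript
// Hypothetical helper: copies every entry of a plain object onto a request.
// `request` is assumed to expose addHeader(name, value), as HttpGet and
// HttpPost do; any object with that method will work.
function applyHeaders(request, headers) {
    Object.keys(headers).forEach(function (name) {
        // Coerce to a string so numbers and Java strings are treated alike.
        request.addHeader(name, "" + headers[name]);
    });
    return request;
}

// Stand-in request so the sketch can run outside a pipeline.
var fakeRequest = {
    sent: {},
    addHeader: function (name, value) { this.sent[name] = value; }
};

applyHeaders(fakeRequest, {
    "Accept": "application/json",
    "X-Api-Key": "example-key"   // placeholder value, not a real key
});
```

Inside a stage, the same call should work with a real request object in place of `fakeRequest`, for example `applyHeaders(new HttpGet(url), {...})`, or an `org.apache.http.client.methods.HttpPost` if you need a POST.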

For example, let's say that we wanted to authenticate and create a session, and then re-use that authentication down the line. In the case of an authentication request, our 'result' object may hold a persistent cookie or some such. We could take that result and persist the cookie in the system properties:

    
          var System = java.lang.System;
          // store the cookie as a plain string
          System.setProperty("persistentCookie", result.toString());
 

Now we've saved off our session id, and can reuse it down the line for subsequent requests.  

    
           var System = java.lang.System;
           var cookie = System.getProperties().get("persistentCookie");
           request.addHeader("Cookie", cookie.toString());
 

This way we can re-use our initially authenticated session.
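
The save-then-reuse steps above can be folded into one helper that only authenticates when no cookie has been stored yet. This is a minimal sketch, not a definitive implementation: `getSessionCookie`, `fakeStore`, `fakeAuthenticate`, and the cookie value are hypothetical names and data. The `store` argument is assumed to expose `getProperty(key)` and `setProperty(key, value)`, which `java.lang.System` does.

```javascript
// Hypothetical helper: returns the cached cookie if there is one, otherwise
// calls `authenticate` once and caches whatever it returns.
// `store` is assumed to expose getProperty(key) and setProperty(key, value).
function getSessionCookie(store, key, authenticate) {
    var cached = store.getProperty(key);
    if (cached !== null && cached !== undefined) {
        return "" + cached;
    }
    var fresh = "" + authenticate();
    store.setProperty(key, fresh);
    return fresh;
}

// In-memory stand-in for java.lang.System so the sketch runs anywhere.
var fakeStore = {
    values: {},
    getProperty: function (key) {
        return this.values.hasOwnProperty(key) ? this.values[key] : null;
    },
    setProperty: function (key, value) { this.values[key] = value; }
};

// Stand-in for the real authentication request; counts how often it runs.
var authCalls = 0;
function fakeAuthenticate() {
    authCalls += 1;
    return "JSESSIONID=abc123";   // placeholder cookie
}

var first = getSessionCookie(fakeStore, "persistentCookie", fakeAuthenticate);
var second = getSessionCookie(fakeStore, "persistentCookie", fakeAuthenticate);
```

In a stage you would presumably pass `java.lang.System` as the store and a function that performs the real login request. Keep in mind that system properties are shared across the whole JVM, so pick a key that won't collide with anything else.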

There are many potential edge cases that may require you to take control of the requests being made.  This is just one.  Just keep in mind that you can do pretty much anything you need to do in a JavaScript stage, and in this instance, you have the full capabilities of the Apache HttpClient at your disposal. 
