Blog by Edo Frederix edofrederix@gmail.com

Matlab Fast SOAP

August 20, 2011

Abstract

Matlab comes with built-in SOAP message handling functions. These rely on a DOM approach. While flexible and robust, the DOM approach scales poorly for large XML documents. To avoid this, I have written a set of replacement functions.

For Simple Object Access Protocol (SOAP) document handling, Matlab has three built-in functions that all rely on a Java DOM object:

  • createSoapMessage(): Takes data structured in a cell object and places this data in the SOAP object. For every insertion, it has to iterate through all previously inserted elements to find the right position. It cannot build a set of elements and insert them all at once into the SOAP object.
  • callSoapService(): Takes the generated SOAP object, converts it into a Java string, and sends the bytes to the web service. After a successful upload, the web service gives its response, whatever that may be. Once the transaction has finished, callSoapService returns the resulting document.
  • parseSoapResponse(): Takes the response and parses all body elements into a new DOM object. Like createSoapMessage(), this function selects each element individually and inserts it into the object. For each subsequent element, it has to iterate through both the response document and the new object.
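
To get a feel for why scanning all previous elements on every insertion hurts, here is a small back-of-the-envelope cost model. It is only a sketch, not a measurement of Matlab's functions: it assumes that insertion number k visits the k-1 elements already present, and it uses the two message sizes that come up later in this post. The variable names are made up for the illustration.

```matlab
% Simple cost model (an assumption, not a profile of createSoapMessage):
% insertion k walks over the k-1 elements that were inserted before it.
n = [12288 589824];               % 4096*3 and 256*256*9 elements

scanVisits = n .* (n - 1) / 2;    % total visits when every insert rescans
bulkVisits = n;                   % total visits when all elements are written in one pass

% How much more work the larger message needs under each model.
scanGrowth = scanVisits(2) / scanVisits(1);
bulkGrowth = bulkVisits(2) / bulkVisits(1);

fprintf('rescanning: %.0f times more work, single pass: %.0f times more work\n', ...
    scanGrowth, bulkGrowth);
```

Under this model, a message with 48 times as many elements needs about 2300 times as much work when every insertion rescans, but only 48 times as much when everything is written in a single pass.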

Of course, the default Matlab way of doing this is the obvious choice, and the right choice. People often use SOAP messages for relatively small data transactions. Having a dynamic approach here is far more important than gaining one or two milliseconds.

However, things change when the size of the data transaction increases from a few strings or numbers to thousands of database values. Being able to send and receive large sets of data is crucial in the academic field, as simulations and calculations are often associated with millions of points. Efficiency now becomes more important than flexibility. For example, creating a message containing 4096*3 XML elements takes around 10 seconds. Using parseSoapResponse to ingest a SOAP message containing 16384 velocity gradient tensors does not finish within a reasonable time. The need for a faster approach is born.

On my Github page I have hosted a project called Matlab-Fast-SOAP. This project aims to create replacement functions for the routines mentioned above that are as fast as possible, without losing too much flexibility. By using simple string operations like fprintf() and regexp(), Matlab-Fast-SOAP is able to create XML documents with thousands of tags and ingest large XML responses. The README of this project has more information on how to work with Matlab-Fast-SOAP. The README also contains an example run, with stunning results. Where Matlab takes around 10 seconds to create a 4096*3 element XML document, Matlab-Fast-SOAP needs only 0.6 seconds to create 589824 elements. That is 48 times as many elements in roughly a sixteenth of the time, and infinitely faster than a job that never finishes. As the timings below show, the new bottleneck in processing time is on the web service side, and is outside our control.

Fetching 9 velocity gradient components for 256x256 points (589824 XML entries)
Creating the SOAP message: 0.630183 s
Sending SOAP message and receiving SOAP response: 21.680725 s
Parsing SOAP response (into responseStruct): 3.404503 s
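
To illustrate the string-based idea described above, here is a minimal sketch of building a SOAP body with one formatted-print call and reading values back with one regexp() call. It is not the code from Matlab-Fast-SOAP: the element names (GetVelocity, Point3, x, y, z) and the sample data are made up for the example, and it uses sprintf(), the string-returning sibling of fprintf(), so that the result stays in memory.

```matlab
% Hypothetical payload: x, y, z coordinates for four points (one point per row).
points = [0.1 0.2 0.3; 1.1 1.2 1.3; 2.1 2.2 2.3; 3.1 3.2 3.3];

% sprintf reuses its format until all arguments are consumed, reading the
% matrix in column-major order. Transposing therefore yields one <Point3>
% element per row of points, without any loop or DOM insertion.
body = sprintf('<Point3><x>%.6g</x><y>%.6g</y><z>%.6g</z></Point3>', points.');

% Wrap the body in a SOAP 1.1 envelope. GetVelocity is a made-up method name.
envelope = ['<?xml version="1.0" encoding="utf-8"?>' ...
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">' ...
    '<soap:Body><GetVelocity>' body '</GetVelocity></soap:Body></soap:Envelope>'];

% Parse in a single pass: each match returns three tokens (x, y, z) as strings.
tokens = regexp(envelope, '<x>([^<]*)</x><y>([^<]*)</y><z>([^<]*)</z>', 'tokens');

% Flatten the nested cells, convert to numbers, and restore one point per row.
flat   = [tokens{:}];
parsed = reshape(str2double(flat), 3, []).';

fprintf('Built %d characters of XML, parsed %d points\n', numel(envelope), size(parsed, 1));
```

The trade-off is the one discussed above: a fixed format string and a fixed pattern are much less flexible than a DOM, and this sketch does no escaping of special XML characters, but the work grows with a single pass over the data instead of a rescan per element.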