Tuesday, March 10, 2009

Enabling Peer-to-Peer BitTorrent Downloads with Azureus

by Jacobus Steenkamp
06/22/2007
The type of traffic distribution on the Internet today is quite different from what you might have encountered only a few years ago. In the past, the vast majority of Internet bandwidth was used to transfer character streams (in most cases HTML) over either HTTP or HTTPS. This trend has changed over the past few years, with a great deal of bandwidth (33 to 50 percent by some estimates) now being used to distribute large files over peer-to-peer connections. BitTorrent is one of the more popular protocols used for peer-to-peer file transfers, and enabling your Java applications to use this protocol has never been easier.

Peer-to-peer networks rely primarily on the bandwidth and hardware of the participants in the network rather than on a relatively small set of centralized servers. It is, therefore, much cheaper in terms of bandwidth and energy costs for the content provider to distribute large files using a peer-to-peer network rather than through the traditional client-server approach. There are quite a few examples in the industry where peer-to-peer networking has already been taken advantage of:

Blizzard's World of Warcraft game uses the BitTorrent protocol to send game updates to clients.
The BitTorrent protocol is often used to distribute free and open source software. OpenOffice.org and popular Linux distributions often offer the option of downloading their software using BitTorrent.
The BBC has recently announced that it will be making hundreds of episodes available over peer-to-peer file sharing networks. By opting to use a peer-to-peer paradigm to distribute its content, the BBC can reach a large audience without the need to invest vast amounts of money in building a server infrastructure.
The BitTorrent Protocol
The BitTorrent Protocol, which was designed and first implemented by Bram Cohen in 2001, is arguably the most popular and efficient peer-to-peer protocol currently in use.

To start sharing a file (or set of files) using BitTorrent, the first peer, or initial seeder, creates a torrent file that contains all the metadata required by clients to start downloading the shared file. This typically includes the name of the shared file (or files), the number of pieces the file has been broken into, the checksum of each piece, and the location of the tracker server, which serves as a central point that coordinates all the connected peers. Unlike the rest of the traffic in a BitTorrent peer group (or swarm), communication with the tracker server is usually performed over HTTP.

Given a torrent file, a BitTorrent client would typically start off by connecting to the tracker server and getting the details of all other peers on the network. It would then start requesting pieces of the shared file from the rest of the swarm and use the checksum values in the torrent file to validate the received data. This BitTorrent process is very nicely illustrated on Wikipedia.

Azureus
Due to the openness of the BitTorrent protocol, numerous compatible BitTorrent clients have been implemented in a variety of programming languages and computing platforms. Out of all the options out there, Azureus, which is implemented using Java and SWT, has proven itself to be one of the more popular and feature-rich clients available. In fact, Azureus is the second most downloaded application on the All-Time Top Downloads list on SourceForge. One can argue that Azureus's popularity probably makes it one of the most successful consumer-targeted Java desktop applications in the world.

In addition to being a great BitTorrent client, Azureus also contains functionality to create torrent files and to set up a tracker server and an initial seeder. In the rest of this article, we will look at how you can leverage these features in your own applications and take advantage of the cost benefits that peer-to-peer file distribution offers.

Getting Started with the Azureus API: A Simple Torrent File Downloader
In this section, we are going to implement a simple command-line application based on the Azureus API (or engine) to download a data file using the BitTorrent protocol. The URL of the torrent file will be passed in at the command line.

public class SimpleStandaloneDownloader {
    ...
    private static AzureusCore core;
    ...
    public static void main(String[] args) throws Exception {

        //Set the default root directory for the Azureus engine.
        //If not set, it defaults to the user's home directory.
        System.setProperty("azureus.config.path", "run-environment/az-config");
        ...
        String url = null;
        ...
        url = args[0];
        ...
        core = AzureusCoreFactory.create();
        core.start();
        ...
        System.out.println("Attempting to download torrent at : " + url);

        File downloadedTorrentFile = downloadTorrentFile(new URL(url));

        System.out.println("Completed download of : " + url);
        System.out.println("File stored as : " + downloadedTorrentFile.getAbsolutePath());

        File downloadDirectory = new File("downloads"); //Destination directory
        if (!downloadDirectory.exists()) downloadDirectory.mkdir();

        //Start the download of the torrent
        GlobalManager globalManager = core.getGlobalManager();
        DownloadManager manager = globalManager.addDownloadManager(
                downloadedTorrentFile.getAbsolutePath(),
                downloadDirectory.getAbsolutePath());

        DownloadManagerListener listener = new DownloadStateListener();
        manager.addListener(listener);
        globalManager.startAllDownloads();
    }
}
The singleton AzureusCore instance is the central axis around which the whole Azureus API revolves. After creating it (using the AzureusCoreFactory) and starting it, you are ready to start using its functionality. It should be noted that AzureusCore spawns its own threads internally and generally runs asynchronously to the rest of the application.

After the torrent file has been downloaded from the passed-in URL using the downloadTorrentFile() method, the torrent is submitted to Azureus's GlobalManager instance, which is responsible for managing downloads. The DownloadManager that gets returned by the addDownloadManager() method can be used to retrieve a wealth of statistics on the download, including the data send rate and the number of connected peers. In this example we have registered a DownloadManagerListener instance (implemented by the DownloadStateListener class) to track when the torrent data file has started downloading and to print the completed percentage to the command line.
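The body of downloadTorrentFile() is not shown in the listing; a minimal sketch, assuming nothing beyond java.io and java.net and using a temp-file naming scheme of my own choosing, might look like this:

private static File downloadTorrentFile(URL url) throws IOException {
    // Stream the .torrent file at the given URL into a local temp file.
    File torrentFile = File.createTempFile("download", ".torrent");
    InputStream in = url.openStream();
    OutputStream out = new FileOutputStream(torrentFile);
    try {
        byte[] buffer = new byte[4096];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read); //copy the response body chunk by chunk
        }
    } finally {
        out.close();
        in.close();
    }
    return torrentFile;
}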

private static class DownloadStateListener implements DownloadManagerListener {
    ...
    public void stateChanged(DownloadManager manager, int state) {
        switch (state) {
            ...
            case DownloadManager.STATE_DOWNLOADING:
                System.out.println("Downloading....");
                //Start a new daemon thread to periodically check
                //the progress of the download and print it out
                //to the command line
                Runnable checkAndPrintProgress = new Runnable() {

                    public void run() {
                        try {
                            boolean downloadCompleted = false;
                            while (!downloadCompleted) {
                                AzureusCore core = AzureusCoreFactory.getSingleton();
                                List managers = core.getGlobalManager().getDownloadManagers();

                                //There is only one download in the queue.
                                DownloadManager man = (DownloadManager) managers.get(0);
                                System.out.println("Download is " +
                                        (man.getStats().getCompleted() / 10.0) +
                                        " % complete");
                                downloadCompleted = man.isDownloadComplete(true);
                                //Check the progress every 10 seconds
                                Thread.sleep(10000);
                            }
                        } catch (Exception e) {
                            throw new RuntimeException(e);
                        }
                    }
                };

                Thread progressChecker = new Thread(checkAndPrintProgress);
                progressChecker.setDaemon(true);
                progressChecker.start();
                break;
            ...
        }
    }

    public void downloadComplete(DownloadManager manager) {
        System.out.println("Download Completed - Exiting.....");
        AzureusCore core = AzureusCoreFactory.getSingleton();
        try {
            core.requestStop();
        } catch (AzureusCoreException aze) {
            System.out.println("Could not end Azureus session gracefully - " +
                    "forcing exit.....");
            core.stop();
        }
    }
    ...
  } //end of DownloadStateListener
} //end of SimpleStandaloneDownloader

GMF: Beyond the Wizards

by Jeff Richley
07/11/2007
In today's development environment, users expect to be able to visualize data, configuration, and even the processes of a system. For this reason, they use tools to communicate requirements visually with stakeholders and subject matter experts. Think for a moment about UML: it takes a very complex set of data and represents it visually to simplify the communication of software requirements and design. Likewise, there are potential visual tools for describing workflows, data mining, server management, and many other business processes. These tools can boost productivity and reduce cost, which is obviously a win-win situation.

Historically, writing these tools has been very time consuming and reserved for those GUI gurus that are well above mere mortals. However, that barrier has been broken down for us by the folks working on the Eclipse Graphical Modeling Framework (GMF).

You may be wondering, "What is GMF and what can it do for me?" GMF is a framework that takes a set of configuration files (a domain model, a graphical definition, and a tool definition), puts them all in a blender, and, poof, like magic, out comes a professional-looking Eclipse plug-in. Not only does it generate most of the functionality that you have designed, it also gives you many freebies such as printing, drag-and-drop, save to image, and customization. Once you have completed the plug-in and all of its handy features, you can then distribute it to your user base for widespread use. There are even features of the Eclipse Plug-in Development Environment (PDE) for creating a distribution site that will help with the nightmare of keeping all of those clients up-to-date.

If you've done any UI programming at all, you realize just how much feature-prone (read: bug-prone) coding this eliminates. All of the MVC setup, layout management, property listeners, and the like are generated for you. The mundane, cookie-cutter work is generated, which allows you to concentrate on the fun and creative parts of your projects.

Tutorials that show you how to get started with GMF jump right into the wizards that are provided as part of the SDK. The wizards and dashboard that are used to develop GMF applications are very powerful. With the exception of your data model, all of the configuration files can be generated from wizards. I am all for wizards, but I tend to go by the motto "Don't generate what you don't understand." Let's take a look under the covers of the wizards, in particular, ecore, gmfgraph, gmftool, and gmfmap.

The domain model (the ecore/genmodel files) is the starting place for development of most Eclipse-based applications. The basic development pattern for EMF is to model your domain objects and have EMF generate your entire model code base, including beans and glue code. EMF is not discussed in depth in this article, but resources are listed at the end.

The graphical and tooling definitions are straightforward. The graphical side is a list of figures, described in gmfgraph files, which will be used in the diagram to display classes from the domain model. The gmftool file is a tooling definition that defines what text you want to display on the tool palette and the button's tool tip.

The final step is to tell GMF how all of these pieces work together by creating a gmfmap file. This is the glue that brings the other three configuration files together by telling GMF what action to take when a tool is selected, what classes are to be created, and what figures to render when those classes are added to the diagram. Once everything is wired together, generate a gmfgen file and application code, fire up a test instance of Eclipse, and test out your new application.

Now that we have talked about what GMF applications are and have a general idea of the steps involved in making them, let's take a look at a sample application that models managing a coffee shop. The beginning functionality allows you to add managers and employees, as well as associate a manager to the employees that she is responsible for. This is a fairly handy little tool, but it would be even better if we could add coffee machines to the shop. After all, this is a coffee shop and we need to make hot dirty brown water, right?

Let's fire up Eclipse to see the original plug-in and then we will add a coffee machine into the mix. Once you have added the projects to Eclipse, run the sample application (see Figure 1).


Figure 1. Running an Eclipse plug-in

Create a new coffee shop diagram by selecting File->New->Other->Examples->Coffee Diagram. This will give you a brand new diagram to play around with (Figure 2). Go ahead, add a manager or two, a few employees, and wire the managers with their employees. Once you have created a diagram, save it — in fact, keep it for later when you have wired in the coffee machines.


Figure 2. Sample coffee shop diagram

Now that you have the original set up and working, let's add the ability to create instances of the CoffeeMachine class. The steps for adding a creation action will be:

Define the figure for display
Define the creation tool for the tool palette
Map the creation tool, display figure, and the backing model class
Defining the Figure for Display
Let's first look at creating figures for displaying the CoffeeMachine for your store. Open the coffee.gmfgraph file and poke around to see what is inside (Figure 3). There are four main types of elements in the hierarchy that you need to understand:

Figure Gallery: Shapes for the application
Nodes: Graphical representations of the domain model
Diagram Labels: Labels for the Nodes that give helpful feedback to the user
Connections: Lines denoting relationships between graphical elements

Figure 3. View of the coffee.gmfgraph file

The first step in defining the diagram is to create a figure for the editor to use. Right-click on the Figure Gallery and select New Child->Rectangle (or any other shape that suits your fancy). Select the newly created Rectangle and look at the Properties view. The one line item that must be filled in, at this point, is the Name field. Let me give you a sage word of advice when it comes to naming elements: "Make sure you name your elements so that they are easily identifiable." One mistake that I made was to use vague names that looked very similar to other elements. You will be very happy in the mapping phase if you stay consistent. One naming convention that I typically use is to append Figure or Diagram to the element name. Pick a method that works for you, but once picked, stick with it.

For a good user experience, we would like a figure label to tell what type of model is being displayed. To add a label that shows that the rectangle is actually a coffee machine, right-click on the CoffeeMachineFigure that you just created and select New Child->Label. In the Properties view, give the new Label a name; sticking with the naming convention, it would be something like CoffeeMachineFigureLabel. The Text field denotes what will be displayed on the figure label when it is drawn in the editor. Enter a phrase that would help your user know that it is a coffee machine, such as "<- Coffee Machine ->". Once again, pick a standard way of denoting figures and stick with it; this will go a long way for your users.

In order for GMF to display a model's representation in a diagram, there needs to be a Node to map it to. Create a Node by right-clicking the Canvas and selecting New Child->Nodes Node. This configuration is very straightforward; give it a name and select the figure you want it to use when displaying.

The next step is to make a Diagram Label node. This element will add text to a diagram figure for user feedback. Right-click on the Canvas and select New Child->Labels->Diagram Labels. There are two properties to complete here: Name and Figure. Sticking with our naming conventions, name the new Diagram Label CoffeeMachineDiagramLabel. The Figure is the element from the Figure Gallery to use for display. Select the CoffeeMachineFigureLabel from the drop-down list.

There you have it, a finished gmfgraph definition file for adding a CoffeeMachine to a diagram.

Introduction to JavaFX Script

by Anghel Leonard
08/01/2007
What Is JavaFX?
In the spring of 2007, Sun released a new framework called JavaFX. This is a generic name: JavaFX has two major components, Script and Mobile, and, in the future, Sun will develop more components for it.

The core of JavaFX is JavaFX Script, which is a declarative scripting language. It is very different from Java code, but has a high degree of interactivity with Java classes. Many classes of JavaFX Script are designed to make implementing Swing and Java 2D functionality easier. With JavaFX Script you can develop GUIs, animations, and cool effects for text and graphics using only a few straightforward lines of code. And, as a plus, you can wrap Java and HTML code in JavaFX Script.

The second component, JavaFX Mobile, is a platform for developing Java applications for portable devices. It will eventually be a great platform for JavaFX Script, but for now is largely irrelevant to the content of this article.

Some Examples of JavaFX Applications
Before we start learning a new language, let's see some examples of JavaFX code. A good resource for examples can be found at the official JavaFX site. To download the examples, please click on JavaFX Script 2D Graphics Tutorial. After the download is complete just double-click the tutorial.jnlp file. In a few seconds you should see something like Figure 1 (if you don't see this image, then you have to configure Java Web Start for the .jnlp extension).


Figure 1. Running the tutorial.jnlp tutorial

Take your time looking over these examples and the source code. There are many interesting effects that can be obtained with just a few JavaFX lines.

If you are still skeptical about the utility of JavaFX, take a look at these two demos; they are partial re-creations of the StudioMoto and Tesla Motors sites. You can download these demos from Project OpenJFX by clicking JavaFX Script Studiomoto Demo and JavaFX Script Tesla Demo. They require Java Web Start in order to run, but depending on your machine configuration they may start automatically, or you may have to find and run the downloaded .jnlp file.

Download and Install JavaFX
If you are interested in learning to develop JavaFX applications, then you should know that there are at least three methods for working with JavaFX. Also, it is important to know that JavaFX applications are not browser-based. The simplest and quickest method is based on a lightweight tool called JavaFXPad. The major advantage of using this tool is that you can almost immediately see the effect of the changes you are making in the editor. You can download this tool from Project OpenJFX by clicking JavaFX Script JavaFXPad Demo. Again, running this requires Java Web Start (see Figure 2).


Figure 2. Running the JavaFXPad editor

Another way to work with JavaFX is to use the JavaFX Script Plug-in for NetBeans 5.5 or a JavaFX Script Plug-in for Eclipse 3.2 (of course, before downloading and installing any of these plug-ins you must have NetBeans 5.5 or Eclipse 3.2 already installed).

If you decide to start with the JavaFX plug-in for NetBeans 5.5, the instructions on Project OpenJFX for JavaFX for NetBeans will help you. Similarly, if you want to use the JavaFX plug-in for Eclipse, go to JavaFX for Eclipse. Note that all the examples in this article were tested with the JavaFX plug-in for NetBeans 5.5, but they should work with any of the other listed methods.

Testing the Hello World Application with JavaFX Plug-In for NetBeans 5.5
As always when learning a new language, we have to write the obligatory Hello World application:

Listing 1
import javafx.ui.*;
import java.lang.System;

Frame {
    centerOnScreen: true
    visible: true
    height: 50
    width: 350
    title: "HelloWorld application..."
    background: yellow
    onClose: operation() {System.exit(0);}
    content: Label {
        text: "Hello World"
    }
}
To develop and run this simple example in NetBeans 5.5, follow these steps:

Launch NetBeans 5.5.
From the main menu select File -> New Project.
In the New Project window, select the General category and Java Application project (click Next).
In the New Java Application window, type "FXExample" in the Project Name text field.
In the same window use the Browse button to select the location of the project.
Uncheck the "Set as main project" and "Create main class" checkboxes (click Finish).
Right-click on the FXExample -> Source Packages and select New -> File/Folder.
In the New File window select the Other category and the JavaFX File file type (click Next).
In the New JavaFX File window, type "HelloWorld" for File Name and "src" for Folder (click Finish).
Copy the code from Listing 1 and paste it in HelloWorld.fx.
Right-click the FXExample project and select Properties.
In the Project Properties - FXExample window, select the Run node from the Categories pane.
In the Arguments text field, type "HelloWorld" (click OK).
Right-click on FXExample project and select Run Project option.
If everything works, you should see a frame like the one in Figure 3:


Figure 3. Running the Hello World application in NetBeans 5.5

Now you have the software support for developing and running any JavaFX application.

JavaFX Syntax
Before starting with JavaFX, let's go over some of the fine points of the syntax. If you are already familiar with the syntax of the Java language, most of this will look very familiar, but some of it is quite different.

JavaFX Primitive Types
JavaFX supports four primitive types: String (maps to java.lang.String), Boolean (java.lang.Boolean), Number (java.lang.Number), and Integer (byte, short, int, long, or java.math.BigInteger).

JavaFX Variables
A JavaFX variable is declared by using the var keyword. See the following examples:

var x:Number = 0.9;
var name:String = "John";
var y:Integer = 0;
var flag:Boolean = true;

var numbers:Number* = [1,2,3,4,5];

What's the Matter with JMatter?

by Eitan Suez
08/21/2007


It has been approximately a year since I wrote my first article on JMatter, and a year is a long time for a successful open source project. Many things have changed, and I'd like to give you an update. My last article was an introduction to JMatter; it's time we tackled something more advanced.

Allow me to begin with a very brief, orienting description of JMatter.

JMatter proposes that you, the developer of a small business application, concern yourself primarily with the business logic or the domain in question. For example, say we're developing a solution for a school, perhaps to administer or manage a curriculum. Alternatively, perhaps we're trying to write a system to better manage parts at an automotive shop, or perhaps we're dealing with real estate properties for sale. You get the picture.

JMatter further proposes that you consider most software development tasks that are not directly related to the business domain (such as persistence, writing the user interface, authentication, deployment, and more) as plumbing: it's someone else's job. In fact, it's JMatter's job.

Applications developed with JMatter sport user interfaces built on top of the Java Swing toolkit. They are deployed over Java Web Start. For persistence, JMatter leverages Hibernate Core, and is therefore compatible with any database system supported by Hibernate.

To give you further insight into the nature of this framework, let's walk through the construction of a non-trivial JMatter application.

Let's Build an App!
The JMatter framework comes with a half dozen demonstration applications that are designed to teach various aspects of the framework.

For this article, let's develop an application that illustrates some of JMatter's object-oriented capabilities. Whether we've attended it or not, many of us are familiar with the JavaOne conference in San Francisco. Let us then develop an application for managing the JavaOne conference. This application somewhat resembles the Sympster demo application that comes with JMatter. A complete application with all use cases is, of course, a little beyond the time and space that we have for this article, so we'll build the foundation for such an application. I'll let you be the judge of the degree of leverage JMatter provides.

Initial Modeling
I happened to have a copy of the brochure for JavaOne 2006 underneath a stack of papers on my desk. After perusing it, I made the following observations:

JavaOne is a conference, an event, where many talks are given. There seem to be a number of different types of events such as Technical Sessions (TS), which are the meat of the conference. Let's not forget Keynote speeches, and the popular Birds of a Feather (BOF) sessions at night.

Both the BOFs and technical sessions have a unique code such as TS-1234 or BOF-2332, while Keynote sessions do not. BOFs and TSs are also categorized by track, and there appear to be five tracks: Java SE, Java EE, Java ME, Tools, and Cool Stuff. All talks have a speaker, a topic, and a description.

Some speakers are distinguished as rock star speakers, some are Java champions, and some are both. Let's call such accolades Speaker Recognitions.

Typically, a distinction is made between the definition of a talk and the scheduling of a specific talk at a specific time and location. This distinction doesn't appear to be necessary for this application.

Finally, talks are scheduled for different rooms. We might want to keep track of the seating capacity for each room, which would be important if we wanted to manage registration for specific talks.

Here, then is a tentative initial model for our application: Talk (with subclasses: Keynote, BOF, and Technical Session), Speaker (and Speaker Recognition), Room, and Track. Let's go ahead and throw in an additional categorization for a talk: a Talk Level (perhaps with three levels: Beginner, Intermediate, and Advanced) to help us ascertain the expertise level expected of attendees.
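Before creating the project, it may help to see this tentative model sketched as plain Java. This sketch is purely illustrative: the class and field names below are my own shorthand for the observations above (the CodedTalk parent is an invented convenience), and a real JMatter model would use JMatter's own field types and conventions rather than bare Strings and enums.

// Illustrative plain-Java sketch of the tentative JavaOne model.
enum Track { JAVA_SE, JAVA_EE, JAVA_ME, TOOLS, COOL_STUFF }
enum TalkLevel { BEGINNER, INTERMEDIATE, ADVANCED }

class Room {
    String name;
    int seatingCapacity;  // useful if we later manage registration
}

class Speaker {
    String name;
    java.util.Set<String> recognitions;  // e.g., "Rock Star Speaker", "Java Champion"
}

abstract class Talk {
    String topic;
    String description;
    Speaker speaker;
    Room room;
    TalkLevel level;
}

class Keynote extends Talk { }            // no code, no track

abstract class CodedTalk extends Talk {   // hypothetical common parent
    String code;                          // e.g., TS-1234 or BOF-2332
    Track track;                          // one of the five tracks
}

class TechnicalSession extends CodedTalk { }
class BirdsOfAFeather extends CodedTalk { }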

Creating Our Project
Download JMatter from http://jmatter.org/ and unzip (or untar) the distribution. Assuming you've got Ant installed, from the command line, cd into jmatter/ and issue the following command to create a new project:

ant new-project-ui


Figure 1. GUI for creating new JMatter projects

Give your project a name (JavaOneMgr). You have the choice of creating either a standalone or a dependent project. In standalone projects, all the necessary dependencies are bundled into your project. It doesn't matter much which you pick here. Dependent projects are simpler to work with if you're making changes to both your project and the underlying framework.

After creating your project, quit this little app and cd to ../JavaOneMgr, your new project's base directory (feel free to move your new project to another parent directory). The project is already equipped with a build file and base directory structure.

Project Directory Structure and Configuration
The project's directory structure is fairly self-explanatory:

src/: This is where JMatter will expect to find your source code.
test/: Place any JUnit tests you write in this directory.
resources/: This directory contains a variety of application resources. The images/ folder is where you place various image resources: a splash screen and icons representing your model objects that will be used by JMatter's user interface. hibernate.properties is where you configure your application's database connection, among other Hibernate-related concerns (a sample appears after this list). Some model metadata can be specified in the file model-metadata.properties (more in Chapter 11 of JMatter's documentation); the application's localization resources are also located here.
doc/: Place any documentation specific to your application in this directory.
For standalone projects, you will also find a lib/ folder containing all of your application's dependencies. Dependent projects' build files reference dependencies in your JMatter installation.
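As a point of reference, the database-connection portion of hibernate.properties uses the standard Hibernate property names; the values below are illustrative placeholders for an HSQLDB setup and will differ for your database and dialect:

# Sample hibernate.properties entries (illustrative values only)
hibernate.connection.driver_class=org.hsqldb.jdbcDriver
hibernate.connection.url=jdbc:hsqldb:file:db/javaonemgr
hibernate.connection.username=sa
hibernate.connection.password=
hibernate.dialect=org.hibernate.dialect.HSQLDialect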

You'll be using the generated Ant build file to compile your code, generate your database schema, test run your application, run unit tests, and, when your application is ready, to produce the artifacts necessary to deploy it over Java Web Start.

To configure your project with an IDE, you typically must communicate these pieces of information:

Where your source code is located (specify the src/ folder)
Where to output compiled code (to match the Ant build file, specify build/classes, though we'll typically use the build file for compilation)
Where dependencies are located (for dependent projects, that would be all the jars in jmatter/lib/runtime and the directory jmatter/build/classes)
JMatter requires Java SE version 5 or higher.

We're going to start coding soon, so go ahead and configure your working environment to your tastes.

Schemaless Java-XML Data Binding with VTD-XML

Limitations of Schema-based XML Data Binding
XML data binding APIs are a class of XML processing tools that automatically map XML data into custom, strongly typed objects or data structures, relieving XML developers of the drudgery of DOM or SAX parsing. In order for traditional, static XML data binding tools (e.g., JAXB, Castor, and XMLBeans) to work, developers assume the availability of the XML schema (or its equivalent) for the document. In the first step, most XML data binders compile XML schemas into a set of class files, which the calling applications then include to perform the corresponding "unmarshalling."
However, developers dealing with XML documents don't always have their schemas on hand. And even when the XML schemas are available, slight changes to them (often due to evolving business requirements) require class files to be generated anew. Also, XML data binding is most effective when processing shallow, regular-shaped XML data. When the underlying structure of XML documents is complex, users still need to manually navigate the typed hierarchical trees, a task which can require significant coding.
Most limitations of XML data binding come from its rigid dependency on XML schemas. Unlike many binary data formats, XML is intended primarily as a schemaless data format flexible enough to represent virtually any kind of information. For advanced uses, XML is also extensible: applications may use only the portion of the XML document that they need. Because of XML's extensibility, Web Services and SOA applications are far less likely to break in the face of changes.
The schemaless nature of XML has subtle performance implications for XML data binding. In many cases, only a small subset of an XML document (as opposed to the whole data set) is necessary to drive the application logic. Yet the traditional approach indiscriminately converts entire data sets into objects, producing unnecessary memory and processing overhead.
Binding XML with VTD-XML and XPath
Motivation
While the concept of XML data binding has essentially remained unchanged since the early days of XML, the landscape of XML processing has evolved considerably. The primary purpose of XML data binding APIs is to map XML to objects; the presence of XML schemas merely helps lighten the coding effort of XML processing. In other words, if mapping XML to objects is sufficiently simple, you not only don't need schemas, but have a strong incentive to avoid them because of all the issues they introduce.
As you probably have guessed by looking at the title of this section, the combination of VTD-XML and XPath is ideally suited to schemaless data binding.
Why XPath and VTD-XML?
There are three main reasons why XPath lends itself to our new approach. First, when properly written, your data binding code only needs proximate knowledge (e.g., topology, tag names, etc.) of the XML tree structure, which you can determine by looking at the XML data. XML schemas are no longer mandatory. Furthermore, XPath allows your application to bind the relevant data items and filter out everything else, avoiding wasteful object creation. Finally, the XPath-based code is easy to understand, simple to write and debug, and generally quite maintainable.
But XPath still needs the parsed tree of XML to work. Superior to both DOM and SAX, VTD-XML offers a long list of features and benefits relevant to data binding, some of which are highlighted in the following list.
High performance, low memory usage, and ease of use: The SAX parser uses a constant amount of memory regardless of document size, but doesn't export the hierarchical structure of XML, which makes it difficult to use. It doesn't even support XPath. The DOM parser builds an in-memory tree, is easier to use, and supports XPath, but it is also very slow and incurs exorbitant memory usage. VTD-XML pushes the XML processing envelope to a whole new level. Like DOM, VTD-XML builds an in-memory tree and is capable of random access, but it consumes only about 1/5 the memory of DOM. Performance-wise, VTD-XML not only outperforms DOM by 5x to 12x, but is also typically twice as fast as SAX with a null content handler (its maximum performance). The benchmark comparison can be found here.
Non-blocking XPath implementation: VTD-XML also pioneers incremental, non-blocking XPath evaluation. Unlike traditional XPath engines that return the entire evaluated node set all at once, VTD-XML's AutoPilot-based evaluation returns a qualified node as soon as it is evaluated, resulting in unsurpassed performance and flexibility. For further reading, please visit http://www.devx.com/xml/Article/34045.
Native XML indexing: VTD-XML is a native XML indexer that allows your applications to run XPath queries without re-parsing the document.
Incremental update: VTD-XML is the only XML processing API that allows you to update XML content without touching irrelevant parts of the XML document (See this article on devx.com), improving performance and efficiency from a different angle.
Process Description
The process for our new schemaless XML data binding roughly consists of the following steps.
Observe the XML document and write down the XPath expressions corresponding to the data fields of interest.
Define the class file and member variables to which those data fields are mapped.
Refactor the XPath expressions in step 1 to reduce navigation cost.
Write the XPath-based data binding routine that does the object mapping. XPath 1.0 allows an expression to be evaluated to one of four data types: string, Boolean, double, and node set. The string type can be further converted to additional data types.
If the XML processing requires the ability to both read and write, use VTD-XML's XMLModifier to update XML's content. You may need to record more information to take advantage of VTD-XML's incremental update capability.
A Sample Project
Let me show you how to put this new XML binding into action. This project, written in Java, follows the steps outlined above to create simple data binding routines. The first part of this project creates read-only objects that are not modified by application logic. The second part extracts more information so that the XML document can be updated incrementally. The last part adds VTD+XML indexing to the mix. The XML document I use in this example looks like the following:

<CATALOG>
    <CD>
        <TITLE>Empire Burlesque</TITLE>
        <ARTIST>Bob Dylan</ARTIST>
        <COUNTRY>USA</COUNTRY>
        <COMPANY>Columbia</COMPANY>
        <PRICE>10.90</PRICE>
        <YEAR>1985</YEAR>
    </CD>
    <CD>
        <TITLE>Still Got the Blues</TITLE>
        <ARTIST>Gary More</ARTIST>
        <COUNTRY>UK</COUNTRY>
        <COMPANY>Virgin Records</COMPANY>
        <PRICE>10.20</PRICE>
        <YEAR>1990</YEAR>
    </CD>
    <CD>
        <TITLE>Hide Your Heart</TITLE>
        <ARTIST>Bonnie Tyler</ARTIST>
        <COUNTRY>UK</COUNTRY>
        <COMPANY>CBS Records</COMPANY>
        <PRICE>9.90</PRICE>
        <YEAR>1988</YEAR>
    </CD>
    <CD>
        <TITLE>Greatest Hits</TITLE>
        <ARTIST>Dolly Parton</ARTIST>
        <COUNTRY>USA</COUNTRY>
        <COMPANY>RCA</COMPANY>
        <PRICE>9.90</PRICE>
        <YEAR>1982</YEAR>
    </CD>
</CATALOG>
Read Only
The application logic is driven by CD record objects between 1982 and 1990 (non-inclusive), corresponding to the XPath expression "/CATALOG/CD[YEAR > 1982 and YEAR < 1990]". The class definition (shown below) contains four fields, corresponding to the title, artist, price, and year of a CD.

public class CDRecord {
    String title;
    String artist;
    double price;
    int year;
}
The mapping between the object members and their corresponding XPath expressions is as follows:
The TITLE field corresponds to "/CATALOG/CD[YEAR > 1982 and YEAR < 1990]/TITLE."
The ARTIST field corresponds to "/CATALOG/CD[YEAR > 1982 and YEAR < 1990]/ARTIST."
The PRICE field corresponds to "/CATALOG/CD[YEAR > 1982 and YEAR < 1990]/PRICE."
The YEAR field corresponds to "/CATALOG/CD[YEAR > 1982 and YEAR < 1990]/YEAR."
The XPath expressions can be further refactored (for efficiency reasons) as follows:
Use "/CATALOG/CD[YEAR > 1982 and YEAR < 1990]" to navigate to the CD node.
Use "TITLE" to extract the TITLE field (a string).
Use "ARTIST" to extract the ARTIST field (a string).
Use "PRICE" to extract the PRICE field (a double).
Use "YEAR" to extract the YEAR field (an integer).

Introduction to Amazon S3 with Java and REST

by Eric Heuveneers 11/08/2007
Introduction
Amazon Simple Storage Service (S3) is a service from Amazon that allows you to store files in reliable remote storage for a very competitive price; it is becoming very popular. S3 is used by companies to store photos and videos of their customers, back up their own data, and more. S3 provides both SOAP and REST APIs; this article focuses on using the S3 REST API with the Java programming language.

S3 Basics
S3 handles objects and buckets. An object corresponds to a stored file. Each object has an identifier, an owner, and permissions. Objects are stored in a bucket. A bucket has a unique name that must be compliant with Internet domain naming rules. Once you have an AWS (Amazon Web Services) account, you can create up to 100 buckets associated with that account. An object is addressed by a URL, such as http://s3.amazonaws.com/bucketname/objectid. The object identifier is a filename or a filename with a relative path (e.g., myalbum/august/photo21.jpg). With this naming scheme, S3 storage can appear as a regular file system with folders and subfolders. Notice that the bucket name can also be the hostname in the URL, so your object could also be addressed by http://bucketname.s3.amazonaws.com/objectid.
S3 REST Security
S3 REST resources are secure. This is important not just for your own purposes, but also because customers are billed according to how their S3 buckets and objects are used. An AWSSecretKey is assigned to each AWS customer, and this key is identified by an AWSAccessKeyID. The key must be kept secret and is used to digitally sign REST requests. S3's security features are:
Authentication: Requests include AWSAccessKeyID
Authorization: Access Control List (ACL) could be applied to each resource
Integrity: Requests are digitally signed with AWSSecretKey
Confidentiality: S3 is available through both HTTP and HTTPS
Non-repudiation: Requests are time-stamped (combined with integrity, this is proof of a transaction)
The signing algorithm is HMAC/SHA1 (Hashing for Message Authentication with SHA1). Implementing String signing in Java is done as follows:

private javax.crypto.spec.SecretKeySpec signingKey = null;
private javax.crypto.Mac mac = null;
...
// This method converts the AWSSecretKey into a crypto instance.
public void setKey(String AWSSecretKey) throws Exception
{
    mac = Mac.getInstance("HmacSHA1");
    byte[] keyBytes = AWSSecretKey.getBytes("UTF8");
    signingKey = new SecretKeySpec(keyBytes, "HmacSHA1");
    mac.init(signingKey);
}

// This method creates the S3 signature for a given String.
public String sign(String data) throws Exception
{
    // The signed String must be BASE64 encoded.
    byte[] signBytes = mac.doFinal(data.getBytes("UTF8"));
    String signature = encodeBase64(signBytes);
    return signature;
}
...
Authentication and the signature have to be passed in the Authorization HTTP header, which takes this form (placeholders in angle brackets):

Authorization: AWS <AWSAccessKeyID>:<Signature>
The signature must include the following information:
HTTP method name (PUT, GET, DELETE, etc.)
Content-MD5, if any
Content-Type, if any (e.g., text/plain)
Metadata headers, if any (e.g., "x-amz-acl" for ACL)
GMT timestamp of the request formatted as EEE, dd MMM yyyy HH:mm:ss
URI path such as /mybucket/myobjectid
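Putting those pieces together for the bucket-creation request shown next, the string to sign is simply (the two blank lines are the empty Content-MD5 and Content-Type values):

PUT


Sun, 05 Aug 2007 15:33:59 GMT
/onjava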
Here is a sample of a successful S3 REST request/response to create the "onjava" bucket:

Request:
PUT /onjava HTTP/1.1
Content-Length: 0
User-Agent: jClientUpload
Host: s3.amazonaws.com
Date: Sun, 05 Aug 2007 15:33:59 GMT
Authorization: AWS 15B4D3461F177624206A:YFhSWKDg3qDnGbV7JCnkfdz/IHY=
Response:
HTTP/1.1 200 OK
x-amz-id-2: tILPE8NBqoQ2Xn9BaddGf/YlLCSiwrKP+OQOpbi5zazMQ3pC56KQgGk
x-amz-request-id: 676918167DFF7F8C
Date: Sun, 05 Aug 2007 15:30:28 GMT
Location: /onjava
Content-Length: 0
Server: AmazonS3
Notice the delay between the request and response timestamps? The request Date was issued after the response Date. This is because the response date comes from the Amazon S3 server. If the difference between the request and response timestamps is too large, a RequestTimeTooSkewed error is returned. This point is another important feature of S3 security; it isn't possible to roll your clock too far forward or back and make things appear to happen when they didn't.
Note: Thanks to ACLs, an AWS user can grant read access to objects for anyone (anonymous access). Signing is then not required, and objects can be addressed (especially for download) with a browser. This means that S3 can also be used as a hosting service to serve HTML pages, images, videos, and applets; S3 even allows granting time-limited access to objects.
Creating a Bucket
The code below details the Java implementation of the "onjava" S3 bucket creation. It relies on the java.net package for HTTP, java.text for date formatting, and java.util for time stamping. All of these packages are included in J2SE; no external library is needed to talk to the S3 REST interface. First, it generates the String to sign, then it instantiates the HTTP REST connection with the required headers. Finally, it issues the request to the s3.amazonaws.com web server.

public void createBucket() throws Exception
{
    // S3 timestamp pattern.
    String fmt = "EEE, dd MMM yyyy HH:mm:ss ";
    SimpleDateFormat df = new SimpleDateFormat(fmt, Locale.US);
    df.setTimeZone(TimeZone.getTimeZone("GMT"));

    // Data needed for the signature
    String method = "PUT";
    String contentMD5 = "";
    String contentType = "";
    String date = df.format(new Date()) + "GMT";
    String bucket = "/onjava";

    // Generate the signature
    StringBuffer buf = new StringBuffer();
    buf.append(method).append("\n");
    buf.append(contentMD5).append("\n");
    buf.append(contentType).append("\n");
    buf.append(date).append("\n");
    buf.append(bucket);
    String signature = sign(buf.toString());

    // Connection to s3.amazonaws.com
    HttpURLConnection httpConn = null;
    URL url = new URL("http", "s3.amazonaws.com", 80, bucket);
    httpConn = (HttpURLConnection) url.openConnection();
    httpConn.setDoInput(true);
    httpConn.setDoOutput(true);
    httpConn.setUseCaches(false);
    httpConn.setDefaultUseCaches(false);
    httpConn.setAllowUserInteraction(true);
    httpConn.setRequestMethod(method);
    httpConn.setRequestProperty("Date", date);
    httpConn.setRequestProperty("Content-Length", "0");
    String AWSAuth = "AWS " + keyId + ":" + signature;
    httpConn.setRequestProperty("Authorization", AWSAuth);

    // Send the HTTP PUT request.
    int statusCode = httpConn.getResponseCode();
    if ((statusCode / 100) != 2)
    {
        // Deal with the S3 error stream.
        InputStream in = httpConn.getErrorStream();
        String errorStr = getS3ErrorCode(in);
        ...
    }
}
Dealing with REST Errors
Basically, all HTTP 2xx response status codes indicate success, and the others (3xx, 4xx, 5xx) report some kind of error. Details of the error are available in the HTTP response body as an XML document. REST error responses are defined in the S3 developer guide. For instance, an attempt to create a bucket that already exists will return:

HTTP/1.1 409 Conflict
x-amz-request-id: 64202856E5A76A9D
x-amz-id-2: cUKZpqUBR/RuwDVq+3vsO9mMNvdvlh+Xt1dEaW5MJZiL
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Sun, 05 Aug 2007 15:57:11 GMT
Server: AmazonS3

<?xml version="1.0" encoding="UTF-8"?>
<Error>
    <Code>BucketAlreadyExists</Code>
    <Message>The named bucket you tried to create already exists</Message>
    <RequestId>64202856E5A76A9D</RequestId>
    <BucketName>awsdownloads</BucketName>
    <HostId>cUKZpqUBR/RuwDVq+3vsO9mMNvdvlh+Xt1dEaW5MJZiL</HostId>
</Error>
Code is the interesting value in the XML document. Generally, this can be displayed as an error message to the end user. It can be extracted by parsing the XML stream with the SAXParserFactory, SAXParser, and DefaultHandler classes from the org.xml.sax and javax.xml.parsers packages. Basically, you instantiate a SAX parser, then implement an S3ErrorHandler that filters for the Code tag when notified by the SAX parser, and finally return the S3 error code as a String:

public String getS3ErrorCode(InputStream doc) throws Exception
{
    String code = null;
    SAXParserFactory parserfactory = SAXParserFactory.newInstance();
    parserfactory.setNamespaceAware(false);
    parserfactory.setValidating(false);
    SAXParser xmlparser = parserfactory.newSAXParser();
    S3ErrorHandler handler = new S3ErrorHandler();
    xmlparser.parse(doc, handler);
    code = handler.getErrorCode();
    return code;
}

// This inner class implements a SAX handler.
class S3ErrorHandler extends DefaultHandler
{
    private StringBuffer code = new StringBuffer();
    private boolean append = false;

    public void startElement(String uri, String ln, String qn, Attributes atts)
    {
        if (qn.equalsIgnoreCase("Code")) append = true;
    }

    public void endElement(String url, String ln, String qn)
    {
        if (qn.equalsIgnoreCase("Code")) append = false;
    }

    public void characters(char[] ch, int s, int length)
    {
        if (append) code.append(new String(ch, s, length));
    }

    public String getErrorCode()
    {
        return code.toString();
    }
}
A list of all error codes is provided in the S3 developer guide. You're now able to create a bucket on Amazon S3 and deal with errors. Full source code is available in the resources section.
File Uploading
Upload and download operations require more attention. S3 storage is unlimited, but it allows a maximum of 5 GB per object. An optional content MD5 check is supported to make sure that the transfer has not been corrupted, although an MD5 computation on a 5 GB file will take some time even on fast hardware.
S3 stores the uploaded object only if the transfer completes successfully. If a network issue occurs, the file has to be uploaded again from the start. S3 doesn't support resuming or partial updates of object content. That's one of the limits of the first "S" (Simple) in S3, but the simplicity also makes dealing with the API much easier.
When performing a file transfer with S3, you are responsible for streaming the objects. A good implementation will always stream objects; otherwise they will grow in Java's heap, and with S3's limit of 5 GB per object, you could quickly run into a java.lang.OutOfMemoryError.
An example of a good upload implementation is available in the resources section of this article.
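In the meantime, here is a rough sketch of such a streaming upload (not the implementation from the resources section): the bucket and key names are placeholders, and it reuses the keyId field, sign(), and getS3ErrorCode() helpers developed above. Note that Java 5's setFixedLengthStreamingMode() takes an int, which caps this particular sketch at 2 GB.

// Illustrative streaming PUT; signing follows the createBucket() example.
public void putObject(File file, String bucket, String key) throws Exception
{
    String resource = "/" + bucket + "/" + key;
    SimpleDateFormat df = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss ", Locale.US);
    df.setTimeZone(TimeZone.getTimeZone("GMT"));
    String date = df.format(new Date()) + "GMT";
    // Empty Content-MD5 and Content-Type, as in the bucket example.
    String signature = sign("PUT\n\n\n" + date + "\n" + resource);

    URL url = new URL("http", "s3.amazonaws.com", 80, resource);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setDoOutput(true);
    conn.setRequestMethod("PUT");
    conn.setRequestProperty("Date", date);
    conn.setRequestProperty("Authorization", "AWS " + keyId + ":" + signature);
    // Fixed-length streaming keeps the request body out of the Java heap.
    conn.setFixedLengthStreamingMode((int) file.length());

    InputStream in = new FileInputStream(file);
    OutputStream out = conn.getOutputStream();
    try {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);  // copy one chunk at a time
        }
    } finally {
        out.close();
        in.close();
    }
    if (conn.getResponseCode() / 100 != 2) {
        // Handle the S3 error stream as shown earlier.
        String error = getS3ErrorCode(conn.getErrorStream());
    }
}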
Beyond This Example
Many other operations are available through the S3 APIs:
List buckets and objects
Delete buckets and objects
Upload and download objects
Add meta-data to objects
Apply permissions
Monitor traffic and get statistics (still a beta API)
Adding custom metadata to an object is an interesting feature. For example, when uploading a video file, you could add "author," "title," and "location" properties, and retrieve them later when listing the objects. Getting statistics (IP address, referrer, bytes transferred, time to process, etc.) on buckets could also be useful for monitoring traffic.
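Custom metadata travels as x-amz-meta-* request headers on the upload; hypothetically (remembering from the signing rules above that x-amz-* headers must also be folded into the signed string):

httpConn.setRequestProperty("x-amz-meta-author", "Eric Heuveneers");
httpConn.setRequestProperty("x-amz-meta-title", "Holiday video");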
Conclusion
This article introduced the basics of the Amazon Simple Storage Service REST API. It detailed how to implement bucket creation in Java and how to deal with S3's security principles. It showed that HTTP and XML skills are needed when developing with the S3 REST API. Some S3 operations could be improved (especially for upload), but overall Amazon S3 rocks. To go beyond what was presented in this article, you could check out the Java S3 tools available in the resources section.
References and Resources
Source code: Source code for this article
SOAP: Simple Object Access Protocol
REST: REpresentational State Transfer
S3 APIs: Amazon S3 Developer Guide
HMAC: Keyed-Hashing for Message Authentication (RFC 2104)
S3 forum: S3 forum for developers
S3 upload applet: A Java applet to upload files and folders to S3
Java S3 toolkit: An S3 toolkit for J2SE and J2ME provided by Amazon
Jets3t: Another Java toolkit for S3
Eric Heuveneers is a software developer and an IT consultant with more than eight years of experience. His main skills are in Java/JEE and open source solutions.

Using XML and Jar Utility API to Build a Rule-Based Java EE Auto-Deployer

by Colin (Chun) Lu 11/16/2007
Introduction
Deploying a Java EE application is a common task, but not an easy job. If you have ever been involved in deploying a Java EE application to a large enterprise environment, no doubt you have faced a number of challenges before clicking the deploy button. For instance, you have to figure out how to configure JMS, data sources, database schemas, data migrations, third-party products like Documentum for web publishing, dependencies between components and their deployment order, and so on. Although most of today's application servers support application deployment through their administrative interfaces, the deployment task is still far from being a one-button action.

In the first few sections of this article, I will discuss some of the challenges of Java EE deployment. Then I will introduce an intelligent rule-based auto-deployer application and explain how it can significantly reduce the complexity of Java EE system deployment. I will also give a comprehensive example of how to build XML rules using the XStream utility library, how to extend and analyze standard Java EE packaging (EAR), and how to perform a complex deployment task just by pushing one button.
Challenge 1: Package Limitations
A Java EE application is packaged as an enterprise application archive (EAR) file. The Java EE specification defines the format of an EAR file as depicted in Figure 1.
Figure 1. Standard Java EE EAR file structure
A standard EAR file meets the basic requirements for packaging an application, as most web-based Java EE applications are composed solely of web and/or EJB applications. However, it lacks the capability of packaging more advanced Java EE application modules. For example, the following modules are often used in a Java EE application deployment, but cannot be declared in a standard EAR file:
JDBC Connection Pool and DataSource objects
JMS ConnectionFactory and Destination objects
JMX MBeans
SQL statements
Other resource files
Most Java EE applications require data sources, schema changes, data migrations, and JMS configurations. Today, these components have to be manually configured and deployed via an administration interface provided by the implementation vendor. This is typically the responsibility of the system administrator.
Challenge 2: Deployment Order and Dependencies
Another challenge is that the application deployer has to know the deployment dependencies and follow the exact order when deploying multiple deployments for one application.
A large Java EE application may have complex dependencies on other deployments. For example, corresponding database tables must be created before an application can be deployed; a JDBC data source must be configured ahead of a JMS server. In these situations, the deployer first has to coordinate with the application architect and developers to find out the deployment requirements and dependencies, and then make a detailed deployment plan. This process is not very efficient; we need a better solution.
Solution
How can we help a deployer survive these challenges? Is there a way to simplify this complex deployment process? A possible solution is to use vendor-proprietary capabilities to extend your EAR to be more intelligent. For example, WebLogic Server supports packaging JDBC and JMS modules into an EAR file, and the WebLogic Deployer can deploy your application as well as application-scoped JDBC and JMS modules in one action. Isn't that useful? Wait a second; there are still limitations:
Tightly coupled - By doing this, your EAR depends on one vendor's application server. Your application has to be packaged according to the vendor's specification. This means that if you want your product to be deployed across different application servers (or if your product needs to support multiple application servers), you have to maintain multiple EARs.
Complicated packaging - Since you have to follow specifications of different vendors, the application packaging is going to be very complicated and hard to understand.
Hard to maintain - For one application, you need to maintain different versions of EARs for different application servers, or even for different versions of the same application server.
Not a true one-button deployment - Since this type of deployment leverages vendor-specific tools, it cannot support deployment tasks that are not supported by the application server. For example, one application may need to execute SQL statements to build schemas and load reference data, or upload configurations to an LDAP server to expose its service endpoints.
A practical solution is to make an intelligent XML rule-based auto-deployer by extending the Java EE packaging.
A Rule-Based Auto-Deployer
This solution has three main parts:
Tool: deployment XML rule generator using XStream
Packaging: extend EAR packaging to include the rule XML document using Ant
Deployer: EAR analyzer and auto-deployer using Java's Jar Utility API
The suggested deployment work flow is illustrated in Figure 2.
Figure 2. Deployment work flow
Case Study
Let's consider the deployment of a Service Order Processing application to a WebLogic server. Here are the deployment tasks that need to be done:
Configure a JDBC connection pool and data source for manipulating the order processing data.
Execute SQL statements to create database objects (tables, triggers, reference data, etc.).
Configure a JMS queue to dispatch service order requests.
Upload system properties (e.g., the URL and JNDI name of the JMS queue for order processing) to an LDAP server.
Finally, deploy the application to an application server.
1. Deployment Tool: XML Rule Generator using XStream
The first step is to generate an XML rule from a plan by the application assembler.
Step 1: Define a deployment plan
To define a deployment plan, the application assembler discusses the deployment requirements with developers and architects. For the sample service order processing system, a deployment plan is defined below:

DataSource,t3://localhost:7001,NONXA,jdbc/testDS,colin,password,jdbc:oracle:thin:@localhost:1521:localdb,oracle.jdbc.driver.OracleDriver
SQL,t3://localhost:7001,jdbc/testDS,sql/testDS.sql
JMS,t3://localhost:7001,PTP,testJmsServer,testJmsRes,jmsTestConnFactory,jms/conn,testQueue,jms/testQueue
LDAP,ldapread-server.com,489,cn=one_button_deployment,o=system_configuration,ldif/test.ldif
APPLICATION,t3://localhost:7001,SOManager,Release v1.0
Step 2: Use the Deployment Tool to generate an XML document from the plan
After the plan is defined, the application assembler runs the deployment tool to feed in the plan and generate the XML rule document.
The sample application is shown in Figure 3.
Figure 3. Sample deployment tool
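To make step 2 concrete, here is a hedged sketch of how one parsed plan line could be turned into XML with XStream. The DataSourceRule class and its field names are invented for illustration (the values come from the DataSource line of the plan above); only the XStream calls themselves, alias() and toXML(), are standard XStream API.

import com.thoughtworks.xstream.XStream;

// Hypothetical rule bean for one DataSource line of the deployment plan.
public class DataSourceRule {
    String serverUrl = "t3://localhost:7001";
    String txType = "NONXA";
    String jndiName = "jdbc/testDS";
    String user = "colin";
    String password = "password";
    String dbUrl = "jdbc:oracle:thin:@localhost:1521:localdb";
    String driver = "oracle.jdbc.driver.OracleDriver";

    public static void main(String[] args) {
        XStream xstream = new XStream();
        // Map the class to a friendlier element name in the rule document.
        xstream.alias("datasource", DataSourceRule.class);
        System.out.println(xstream.toXML(new DataSourceRule()));
    }
}

Running this prints a <datasource> element with one child element per field, which is the kind of XML rule document the auto-deployer can later read back (XStream's fromXML() performs the reverse mapping), keeping the rule format and the rule beans in lockstep without any hand-written parsing code.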