Monday, August 12, 2013

Getting your stuff - using the RTC SDK to zip a repository workspace

Have you ever wanted to create a zip archive of a repository workspace? This post describes how to use the RTC SDK (and some undocumented internals) to copy the contents of a repository workspaces into a zip file.

Let's start with some background. A repository workspace contains components, each of which has a configuration - which is a ten dollar word for "file tree." The configuration provides access to the structure of the tree and has pointers to the file content.

A program that zips up a remote workspace needs to:
  1. log into the repository,
  2. find the workspace to zip,
  3. get the components in the workspace,
  4. get the configuration of each component, and finally
  5. walk the file tree to write each directory/file to our zip.
This post uses unsupported API. It is likely that the APIs will change without warning in future iterations of RTC - you should use the Source Control command line tool instead. You are using these APIs at your own risk.
Each step is addressed in its own section. The code samples are available on JazzHub or can be downloaded as a zip. You are expected to have a working development environment with the RTC SDK and RTC server configured properly.

Logging in to the repository

All of the information we're interested in is stored on the RTC repository. In order to access it, our programs needs to log in:
ITeamRepository repo = TeamPlatform.getTeamRepositoryService().getTeamRepository(uri);
  
repo.registerLoginHandler(new MyLoginHandler(username, pw));

repo.login(null);
The MyLoginHandler class included in the sample project. The username and
pw are configured beforehand.

Finding the workspace

For our example, we find the workspace by searching for one with a specific name using the IWorkspaceManager#findWorkspaces() method.

The interesting classes here are the SCMPlatform singleton that allows us to get a IWorkspaceManager, and the IWorkspaceSearchCriteria. The workspace manager answers simple queries about workspaces, as well as providing access to IWorkspaceConnections, that we will learn more about below.

IWorkspaceManager mgr = SCMPlatform.getWorkspaceManager(repo);


// Find the named workspace
IWorkspaceSearchCriteria cri = IWorkspaceSearchCriteria.FACTORY.newInstance();
cri.getFilterByOwnerOptional().add(repo.loggedInContributor());
cri.setExactName(wsName);
List<IWorkspaceHandle> findWorkspaces = mgr.findWorkspaces(cri, 2, null);

if (findWorkspaces.size() == 0) {
 System.err.println("Couldn't find any workspaces named \"" + wsName + "\"");
 return false;
}

if (findWorkspaces.size() > 1) {
 System.err.println("Multiple workspaces named \"" + wsName + "\"");
 return false;
}
The IWorkspaceSearchCriteria allows us to build a query that matches a number of workspaces. The query fields are implicitly and'ed together to limit the workspaces returned. In an ideal world, we'll find exactly one workspace that matches our criteria.

Searching by workspace name is useful for our example, but it isn't the safest thing to do in a production environment, since there could be multiple workspaces with the same name. In production UI code, we would present the user with a choice of workspaces. If the code were headless, we would use the ID of the workspace we care about.

Because we want a modicum of correctness, our example ensures that there is exactly one visible workspace with the given name (lines 68-76).

The findWorkspaces() method returns a list of handles to the workspaces matching the criteria. A handle is a lightweight object that identifies an item in the repository. We'll use the handle later on to query the workspaces.

Getting the components in the workspace


Once we've found the IWorkspaceHandle, we need a richer representation to query the file tree. We do that by converting the handle into an IWorkspaceConnection (see line 78, below). The IWorkspaceConnection provides operations on a repository workspace, and caches information about the workspace.

Now we have the connection, we start digging into the structure of the workspace. The topmost layer in the logical tree consists of components. Components split remote workspaces into logical groupings of files and folders. They aren't normally represented in the local filesystem when loaded, so our zip creator won't record them in the zip.

IWorkspaceConnection wsConn = mgr.getWorkspaceConnection(findWorkspaces.get(0), null);

// Start walking the workspace contents
IFileContentManager contentManager = FileSystemCore.getContentManager(repo);

File base = new File(System.getProperty("user.dir"));

FileOutputStream out = new FileOutputStream(new File(base, wsName + ".zip"));
try {
 ZipOutputStream zos = new ZipOutputStream(out);
 
 for (IComponentHandle compHandle : (List<IComponentHandle>)wsConn.getComponents()) {
  IConfiguration compConfig = wsConn.configuration(compHandle);

  // Fetch the items at the root of each component. We do this to initialize our 
  // queue of stuff to download.
  Map<String, IVersionableHandle> handles = compConfig.childEntriesForRoot(null);
  List<IVersionable> items = compConfig.fetchCompleteItems(new ArrayList<IVersionableHandle>(handles.values()), null);

  loadDirectory(contentManager, compConfig, zos, "", items);
 }
 
 zos.close();

} finally {
 out.close();
}
On line 89 we loop over each of the components, getting an IConfiguration. The configuration encapsulates the file/folder structure in the repository workspace, so we use that to walk the remote filesystem. The first part of the walk is on line 94 where we get the handles of the component's root items. (Note that we're dealing with root items: there could be files and symlinks at the top of the component hierarchy, as well as directories)

Walking the file tree to fetch content


We're finally here! The fun part that involves getting file content from the repository. Unfortunately, this is also where we diverge from the supported API. In our last code snippet, you'll notice that we got an IFileContentManager on line 81. Sadly, that isn't part of the API anyone outside of the SCM Core is supposed to use. If you do use it, be aware that the class could change in future releases. You use it at your own risk.

With the legalese/honesty out of the way, let's look at how we use the forbidden API. Our loadDirectory method is called recursively to write the content of each directory into the zip file. It takes a list of IVersionable items as an argument. An IVersionable is the superclass of the things that live in a filesystem: files, folders, or symlinks. It is possible that other types of items could exist in the configuration - but let's pretend they don't, because, for the most part, they won't.

loadDirectory() loops over each versionable and either creates a directory in the zip or writes the file content into the zip. In the case of folders, it gets the children from the configuration (line 135), and then recursively calls itself:

if (v instanceof IFolder) {
 // Write the directory
 String dirPath = path + v.getName() + "/";
 zos.putNextEntry(new ZipEntry(dirPath));
 
 @SuppressWarnings("unchecked")
 Map<String, IVersionableHandle> children = compConfig.childEntries((IFolderHandle)v, null);
 @SuppressWarnings("unchecked")
 List<IVersionable> completeChildren = compConfig.fetchCompleteItems(new ArrayList<IVersionableHandle>(children.values()), null);

 loadDirectory(contentManager, compConfig, zos, dirPath, completeChildren);
}

More interesting things happen in the IFileItem block:

else if (v instanceof IFileItem) {
 // Get the file contents and write them into the directory
 IFileItem file = (IFileItem) v;
 zos.putNextEntry(new ZipEntry(path + v.getName()));
 
 InputStream in = contentManager.retrieveContentStream(file, file.getContent(), null);
 byte[] arr = new byte[1024];
 int w;
 while (-1 != (w = in.read(arr))) {
  zos.write(arr, 0, w);
 }
 
 zos.closeEntry();
}

The fun part is on line 146 where we ask the (forbidden) IFileContentManager for the content of the file. We pass in the IFileItem as well as its IContent, which is a pointer to the blob of bytes stored in the RTC repository. The remainder of the block is anticlimactic: copying the content with a regular Java stream idiom.

Even though the file content portion of this example is forbidden API, the example helps to show how to use our APIs. The pattern of logging in, finding a workspace, and then using the IWorkspaceConnection to perform some operation may be useful in other contexts. The example doesn't begin to get into the practical complexities of getting content (normalizing line endings, storing file properties, handling symlinks), or the problems faced when merging into an existing filesystem.

You can download the full Eclipse project or poke at the project on JazzHub

6 comments:

  1. This is very useful and helped me a lot working on the reverse direction (from zip to SCM).

    ReplyDelete
  2. The reverse direction is now described here: http://rsjazz.wordpress.com/2013/10/15/extracting-an-archive-into-jazz-scm-using-the-plain-java-client-libraries/

    ReplyDelete
  3. Great stuff, thank you! BTW - the link to the Eclipse project isn't working.

    ReplyDelete
  4. can you pls provide the source codes again? the link above doesn't work

    ReplyDelete