When writing an annotation-based framework, you often need to find all classes that use a specific annotation at deployment time in order to initialize your framework. For example, the EJB 3.0 deployer needs to know which classes are annotated with @Stateless, @Stateful, and @MessageDriven so it can create a container for each of those classes. A JPA provider needs to find the classes within an archive that are annotated with @Entity so that it can define its ORM mapping. This scanning for annotations can be done at runtime using various techniques and open source libraries. I want to discuss how to do this in my blog and point you to a small project I created at SourceForge to help out with this.

Finding archives to scan

The first thing you need to do is to find actual archives you want to scan for classes. Depending on your environment there are different ways to accomplish this.

Java classpath

The “java.class.path” system property, if set properly (FYI, Maven doesn’t set it properly), can be a way to obtain a list of classpath entries/archives you can scan for annotated classes. Although it only gives you plain file paths, which may be relative, you can easily turn these into URLs and/or InputStreams for iterating over.
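As a rough sketch of that conversion, assuming every entry is a local file or directory: the class name ClassPathUrls and the method toUrls below are made up for illustration and are not part of any library.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

final class ClassPathUrls
{
    // Splits a classpath string on the platform separator and converts each entry to a URL.
    static List<URL> toUrls(String classPath)
    {
        List<URL> urls = new ArrayList<URL>();
        if (classPath == null)
        {
            return urls;
        }
        for (String entry : classPath.split(File.pathSeparator))
        {
            if (entry.isEmpty())
            {
                continue; // skip empty entries such as a trailing separator
            }
            try
            {
                // toURI() resolves relative paths and ends existing directories with "/"
                urls.add(new File(entry).toURI().toURL());
            }
            catch (MalformedURLException e)
            {
                throw new RuntimeException(e);
            }
        }
        return urls;
    }

    public static void main(String[] args)
    {
        for (URL url : toUrls(System.getProperty("java.class.path")))
        {
            System.out.println(url);
        }
    }
}
```

Going through File.toURI() rather than concatenating "file:" by hand takes care of making the path absolute and of escaping characters such as spaces.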

Using your classloader

Using ClassLoader.getResource() and ClassLoader.getResources() is a way to obtain URLs to specific resources in your classpath. From these specific resources you can extract the base URL of the classpath/archive the resource resides in by chopping off the resource name. For example, one common technique is to have a marker file in each of your archives. For JPA, this marker file is META-INF/persistence.xml. You could then use ClassLoader.getResources() to find all classpaths/archives that contain that file:

ClassLoader cl = Thread.currentThread().getContextClassLoader();
Enumeration<URL> urls = cl.getResources("META-INF/persistence.xml");

Calculating the URL of the classpath/archive is just a matter of a little string manipulation of the url.
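Here is one way that string manipulation might look. This is only a sketch: ResourceBase and findBase are hypothetical names, and it assumes the resource URL is either a plain directory-style URL or a "jar:...!/" URL, which are the two forms you typically get back from a URLClassLoader.

```java
final class ResourceBase
{
    // Returns the URL string of the directory or jar that contains the given resource.
    static String findBase(String resourceUrl, String resourceName)
    {
        if (!resourceUrl.endsWith(resourceName))
        {
            throw new IllegalArgumentException(resourceUrl + " does not end with " + resourceName);
        }
        // Chop the resource name off the end of the URL
        String base = resourceUrl.substring(0, resourceUrl.length() - resourceName.length());
        if (base.startsWith("jar:") && base.endsWith("!/"))
        {
            // jar:file:/lib/app.jar!/  becomes  file:/lib/app.jar
            base = base.substring("jar:".length(), base.length() - "!/".length());
        }
        return base;
    }
}
```

You would call it with url.toString() for each URL returned by getResources() and then wrap the result in a URL again.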

Web Applications

In web applications it’s very easy to obtain URLs that point to jars within WEB-INF/lib or to work out where the WEB-INF/classes path is by using the ServletContext class. The class has a nice method, getResourcePaths(), that returns the resource paths you want, which getResource() then turns into URLs. For example:

List<URL> urls = new ArrayList<URL>();

Set libJars = servletContext.getResourcePaths("/WEB-INF/lib");
for (Object jar : libJars)
{
    try
    {
       urls.add(servletContext.getResource((String) jar));
    }
    catch (MalformedURLException e)
    {
       throw new RuntimeException(e);
    }
}

ServletContext.getResourcePaths() returns a directory-like listing of all paths under the base path you pass in. Passing in “/WEB-INF/lib” will get you a listing of everything within /WEB-INF/lib of your web application, which is normally your .jar files. Once you have this listing, you can call ServletContext.getResource() to obtain a URL pointing to each of the .jar files.

Finding the /WEB-INF/classes directory is a little trickier.

URL classesPath = null;
Set libJars = servletContext.getResourcePaths("/WEB-INF/classes");
for (Object jar : libJars)
{
   try
   {
      URL url = servletContext.getResource((String) jar);
      String urlString = url.toString();
      int index = urlString.lastIndexOf("/WEB-INF/classes/");
      urlString = urlString.substring(0, index + "/WEB-INF/classes/".length());
      classesPath = new URL(urlString);
      break;
   }
   catch (MalformedURLException e)
   {
      throw new RuntimeException(e);
   }
}

You use the same ServletContext.getResourcePaths() and ServletContext.getResource() but you must do some string manipulation to the URL to get the base path. If you are deploying things within a web framework, you can write a ServletContextListener that obtains access to the servlet context.

Browsing archives

Once you have URLs pointing to directories or .jar files that make up your classpath (or the set of archives/paths you want to scan) you need to browse them. You can usually assume that URLs ending in “/” are some form of a directory while those not ending in “/” are .jar files. If you hack and step through URLClassLoader code, you’ll find that it makes the same assumption. Browsing jars is very easy. If you have a URL pointing to a .jar file all you need to do is open an InputStream to it and instantiate a JarInputStream. This class has methods that allow you to list and open files within the jar archive.
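A minimal sketch of the jar case follows. JarBrowser and listClassEntries are illustrative names only; the JarInputStream and JarEntry calls are the standard java.util.jar ones.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;

final class JarBrowser
{
    // Returns the names of all .class entries in the jar the URL points to.
    static List<String> listClassEntries(URL jarUrl)
    {
        List<String> names = new ArrayList<String>();
        try (InputStream in = jarUrl.openStream();
             JarInputStream jar = new JarInputStream(in))
        {
            JarEntry entry;
            while ((entry = jar.getNextJarEntry()) != null)
            {
                if (!entry.isDirectory() && entry.getName().endsWith(".class"))
                {
                    // The stream is now positioned at this entry's bytes,
                    // so the class file could be read from "jar" right here.
                    names.add(entry.getName());
                }
            }
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

Because it only needs an InputStream, this approach works for any URL protocol that can open a stream, not just “file:”.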

URLs that point to a directory pose an abstraction problem. Browsing a directory structure is protocol specific. If the URL protocol is “file:” you can just use java.io.File. If it’s “http:”, you need to use a WebDAV library. If it’s a different protocol, you need to find a library that allows you to browse that particular protocol.
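For the “file:” case, a recursive walk with java.io.File might look roughly like this. DirectoryBrowser and its methods are hypothetical names for this sketch, and other protocols are simply rejected rather than handled.

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

final class DirectoryBrowser
{
    // Collects the paths of .class files under a file: directory URL, relative to that directory.
    static List<String> listClassFiles(URL dirUrl)
    {
        if (!"file".equals(dirUrl.getProtocol()))
        {
            throw new IllegalArgumentException("Only file: URLs are handled here: " + dirUrl);
        }
        File root;
        try
        {
            root = new File(dirUrl.toURI());
        }
        catch (URISyntaxException e)
        {
            throw new IllegalArgumentException(e);
        }
        List<String> found = new ArrayList<String>();
        walk(root, "", found);
        return found;
    }

    private static void walk(File dir, String prefix, List<String> found)
    {
        File[] children = dir.listFiles();
        if (children == null)
        {
            return; // not a directory, or not readable
        }
        for (File child : children)
        {
            if (child.isDirectory())
            {
                walk(child, prefix + child.getName() + "/", found);
            }
            else if (child.getName().endsWith(".class"))
            {
                found.add(prefix + child.getName());
            }
        }
    }
}
```

The relative paths it returns use “/” as the separator, so they have the same shape as entry names inside a jar.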

Finding annotated classes

So, once you have access to your classpaths you can obtain a list of .class files within those classpaths. It is generally a very bad idea to load each and every one of those classes using your ClassLoader and use the Java reflection API to scan for the annotations you are interested in, for two reasons:

  1. The Java reflection API only allows you to see annotations that are visible at runtime. If you remember your JDK 5.0 lessons, there are three annotation retention policies: Source, Class, and Runtime. Class and Runtime annotations are compiled into your .class files, but only Runtime annotations are visible through reflection.
  2. You generally do not use each and every class in the libraries you are scanning. If you load each class into your ClassLoader, you are filling up the JVM’s perm-gen space and wasting resources.
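The first point is easy to check for yourself. The sketch below puts one annotation of each compiled retention on a class and asks reflection what it can see. RetentionDemo, RuntimeMarker, ClassMarker, and Annotated are made-up names for illustration.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

final class RetentionDemo
{
    @Retention(RetentionPolicy.RUNTIME)
    @interface RuntimeMarker {}

    @Retention(RetentionPolicy.CLASS)
    @interface ClassMarker {}

    // Both annotations are written into the compiled class file
    @RuntimeMarker
    @ClassMarker
    static class Annotated {}

    // Counts the annotations that reflection reports for a class.
    static int visibleCount(Class<?> type)
    {
        return type.getAnnotations().length;
    }

    public static void main(String[] args)
    {
        System.out.println(visibleCount(Annotated.class));
        System.out.println(Annotated.class.isAnnotationPresent(ClassMarker.class));
    }
}
```

Running main prints 1 and then false: reflection reports only @RuntimeMarker, even though @ClassMarker is stored in the same .class file.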

So, if you’re not going to load the class through your classloader, how do you scan for annotations? The answer is to use a bytecode processing library like Javassist or ASM. I am most familiar with Javassist since I wrote the annotation processing for it, so let’s show examples using that library. ASM is perfectly good for this purpose as well.

With Javassist, you load up a ClassFile instance from the InputStreams you browse from your .jar or classpath. This object allows you to view the bytecode structure of your .class file without loading the class and to find the string names of the annotations attached to each element of the class:

// "bits" is an InputStream positioned at the start of a single .class file
DataInputStream dstream = new DataInputStream(new BufferedInputStream(bits));

ClassFile cf =  new ClassFile(dstream);
String className = cf.getName();
// getAttribute() returns null when the class has no annotations of that kind
AnnotationsAttribute visible = (AnnotationsAttribute) cf.getAttribute(AnnotationsAttribute.visibleTag);
AnnotationsAttribute invisible = (AnnotationsAttribute) cf.getAttribute(AnnotationsAttribute.invisibleTag);
if (visible != null)
{
   for (javassist.bytecode.Annotation ann : visible.getAnnotations())
   {
      System.out.println("@" + ann.getTypeName());
   }
}

The visible and invisible attributes correspond to annotations with Runtime and Class retention, respectively. From this information you can obtain the annotations attached to the class. Javassist has a reflection-like API that allows you to iterate over the methods and fields of the ClassFile. Getting annotation information for methods and fields works exactly the same way as getting it from the class.

New Scannotation Framework

I’m writing this blog because I actually had to do a lot of this scanning for JBoss’s EJB container and, more recently, the JAX-RS implementation I’m working on. Because I thought this code might be useful, I created a SourceForge project for it called Scannotation. The Scannotation framework encapsulates all the ideas and functionality I talked about in this blog. It centers around three classes: ClasspathUrlFinder, WarUrlFinder, and AnnotationDB.

ClasspathUrlFinder finds classpath URLs for you. It has methods to obtain them from the java.class.path System property, and other methods that find an archive URL when you provide a classloader resource name, as described earlier in this blog. WarUrlFinder encapsulates finding URLs for your WEB-INF/classes directory as well as the jars in WEB-INF/lib. It’s pointless to repeat the javadocs for these classes, so just follow the links above to find out more information.

The AnnotationDB class consumes URLs you find through ClasspathUrlFinder or WarUrlFinder. It scans them for .class files and uses Javassist to make two indexes:

  • An index keyed on the fully qualified name of an annotation, with a set of classes that use that annotation
  • An index keyed on the fully qualified name of a class, with a set of annotations that class uses
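To make the shape of those two indexes concrete, here is a small sketch of the data structure. It is not AnnotationDB’s actual implementation; TwoWayAnnotationIndex and its methods are invented names, and the only thing taken from the description above is the pair of name-to-set maps.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TwoWayAnnotationIndex
{
    // annotation name -> names of the classes that use it
    private final Map<String, Set<String>> annotationIndex = new HashMap<String, Set<String>>();
    // class name -> names of the annotations it uses
    private final Map<String, Set<String>> classIndex = new HashMap<String, Set<String>>();

    // Records one "class uses annotation" fact in both indexes.
    void record(String className, String annotationName)
    {
        add(annotationIndex, annotationName, className);
        add(classIndex, className, annotationName);
    }

    Set<String> classesAnnotatedWith(String annotationName)
    {
        return lookup(annotationIndex, annotationName);
    }

    Set<String> annotationsOn(String className)
    {
        return lookup(classIndex, className);
    }

    private static void add(Map<String, Set<String>> index, String key, String value)
    {
        Set<String> values = index.get(key);
        if (values == null)
        {
            values = new TreeSet<String>();
            index.put(key, values);
        }
        values.add(value);
    }

    private static Set<String> lookup(Map<String, Set<String>> index, String key)
    {
        Set<String> values = index.get(key);
        return values == null ? Collections.<String>emptySet() : Collections.unmodifiableSet(values);
    }
}
```

Both maps hold plain strings, so nothing in the index forces a scanned class to be loaded; a framework can decide later which of the names it actually wants to hand to a ClassLoader.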

Here’s an example of scanning your classpath:

URL[] urls = ClasspathUrlFinder.findClassPaths(); // scan java.class.path
AnnotationDB db = new AnnotationDB();
db.scanArchives(urls);

Here’s another example of scanning all JPA archives:

URL[] urls = ClasspathUrlFinder.findResourceBases("META-INF/persistence.xml");
AnnotationDB db = new AnnotationDB();
db.scanArchives(urls);
Set<String> entityClasses = db.getAnnotationIndex().get(javax.persistence.Entity.class.getName());

From this mini annotation database, your annotation frameworks can more easily pick out which classes they care about. Eventually I want to write an Ant task and a Maven plugin that will add an annotation index into a file within META-INF of your jars. That way you could precompile this index at build time and save some CPU cycles. If anybody is interested in doing this, let me know and I’ll give you SVN access to the project.

EDITED 3/31/2009:

You might want to check out the Reflections project. I think they have taken Scannotation to the next level. I personally have not been able to maintain the Scannotation project.