Java Configuration Design Update

From Funtoo
Revision as of 01:17, July 2, 2014 by Drobbins (talk | contribs)

This page describes a potential update to the Java support code in Gentoo and Funtoo Linux, with the intention of simplifying the java-config code, making it more correct, and integrating it better with Portage itself. The proposal intends to address deficiencies in the current design of java-utils-2.eclass while maintaining its useful features and avoiding a full rewrite. Rather than replacing the current system with more code, the goal is to simplify the existing code while retaining its functionality, and to allow a graceful migration of advanced functionality into Portage itself.

   Tip

Please post comments on the Discussion page.

Design Challenges

The current Java eclasses have some very interesting and powerful features, yet are lacking in some areas.

Dependency Handling

Currently, java-utils-2.eclass uses depend-java-query to automatically select the most suitable Java VM for building. This is a sophisticated feature that is intended to be 'smart', yet it has some flaws. These flaws are fixable:

  • While JVM auto-selection is smart, it is not always correct -- it doesn't use Portage's API, but instead uses regexes to parse the currently-executing ebuild's DEPEND string.
  • Related to the regex issue, the code in depend-java-query is complex as it tries to duplicate some functionality that could be handled better by Portage's API.

This proposal provides a mechanism for using the Portage API so that the result is always correct. This may seem like a minimal optimization, since the existing code is correct most of the time. But the existing code is duplicative: it produces a rough approximation of what the Portage API could do correctly. So there is no good reason to keep it -- less code means more maintainability.

   Important

Moreover, there is an even better reason for doing things the right way -- eclasses should, as much as possible, be able to behave as "first-class citizens" in the world of Portage. This is an important architectural goal, and there are severe costs for not pursuing it, namely having every eclass under the sun re-implement various parts of Portage, albeit poorly, creating much unnecessary code and frustration!

Atoms

Currently, eselect java-vm and java-config have their own "atoms" that they use to identify Java virtual machines, which seem to take the form ${PN}-${SLOT}. Yet Portage already has atoms, and we should try to add support for them over time. This proposal does require some "link" between the existing JVM atom and the Portage atom. This link currently does not exist.

Proposal

java-config Settings

This proposal involves a minor upgrade to the java-config settings for a JVM -- adding a PORTAGE_ATOM setting to the file, which allows the java-config atom to be linked to the Portage atom:

   /usr/share/java-config-2/vm/oracle-jdk-bin-1.7 (bash source code) - Java-config settings
# Copyright 1999-2011 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: /var/cvsroot/gentoo-x86/dev-java/oracle-jdk-bin/files/oracle-jdk-bin-1.7.env,v 1.2 2011/11/17 22:49:56 caster Exp $

VERSION="Oracle JDK 1.7.0.60"
JAVA_HOME="/opt/oracle-jdk-bin-1.7.0.60"
JDK_HOME="/opt/oracle-jdk-bin-1.7.0.60"
JAVAC=${JAVA_HOME}/bin/javac
PATH="${JAVA_HOME}/bin:${JAVA_HOME}/jre/bin"
ROOTPATH="${JAVA_HOME}/bin:${JAVA_HOME}/jre/bin"
LDPATH="${JAVA_HOME}/jre/lib/amd64/:${JAVA_HOME}/jre/lib/amd64/xawt/:${JAVA_HOME}/jre/lib/amd64/server/"
MANPATH="/opt/oracle-jdk-bin-1.7.0.60/man"
PROVIDES_TYPE="JDK JRE"
PROVIDES_VERSION="1.7"
BOOTCLASSPATH="${JAVA_HOME}/jre/lib/resources.jar:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/jre/lib/sunrsasign.jar:${JAVA_HOME}/jre/lib/jsse.jar:${JAVA_HOME}/jre/lib/jce.jar:${JAVA_HOME}/jre/lib/charsets.jar:${JAVA_HOME}/jre/classes"
GENERATION="2"
ENV_VARS="JAVA_HOME JDK_HOME JAVAC PATH ROOTPATH LDPATH MANPATH"
VMHANDLE="oracle-jdk-bin-1.7"
BUILD_ONLY="FALSE"
PORTAGE_ATOM="dev-java/oracle-jdk-bin-1.7.0.60"

With this very minor and easy-to-implement change, new possibilities are available -- namely, for java-config code to tap into the Portage API and leverage its advanced dependency handling functionality.
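As a rough illustration of how a tool might read the proposed setting, the sketch below parses settings text of the kind shown above and extracts PORTAGE_ATOM alongside VMHANDLE. The function name `parse_vm_settings` and the `SAMPLE` text are made up for this example, and the parser only handles plain `KEY=VALUE` lines with `${VAR}` references, not general bash syntax.

```python
import re

# Plain KEY=VALUE or KEY="VALUE" assignments, as used in the settings file.
_ASSIGN = re.compile(r'^([A-Za-z_][A-Za-z0-9_]*)=(.*)$')
# ${VAR} references to variables defined earlier in the same file.
_VAR = re.compile(r'\$\{([A-Za-z_][A-Za-z0-9_]*)\}')


def parse_vm_settings(text):
    """Parse java-config VM settings text into a dict (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _ASSIGN.match(line)
        if match is None:
            continue  # not a plain assignment; ignored in this sketch
        key, value = match.group(1), match.group(2)
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        # Expand references to variables that were already parsed.
        value = _VAR.sub(lambda m: settings.get(m.group(1), ""), value)
        settings[key] = value
    return settings


# Shortened copy of the settings shown above, used only for the demo.
SAMPLE = '''\
VERSION="Oracle JDK 1.7.0.60"
JAVA_HOME="/opt/oracle-jdk-bin-1.7.0.60"
JAVAC=${JAVA_HOME}/bin/javac
PROVIDES_TYPE="JDK JRE"
PROVIDES_VERSION="1.7"
VMHANDLE="oracle-jdk-bin-1.7"
PORTAGE_ATOM="dev-java/oracle-jdk-bin-1.7.0.60"
'''

vm = parse_vm_settings(SAMPLE)
print(vm["VMHANDLE"], "->", vm["PORTAGE_ATOM"])
```

The resulting dictionary gives the "link" discussed earlier: the java-config handle in VMHANDLE and the Portage identifier in PORTAGE_ATOM sit side by side in one record.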

Provides

   Note

This section is not a requirement for the implementation of this proposal, but a suggestion for future Portage development.

Currently, the functionality that is provided by each JVM is stored in the java-config file shown above, namely in the PROVIDES_TYPE and PROVIDES_VERSION variables. This works for us, though it is unfortunate that the (in my opinion) not completely thought-out GLEP 37 was implemented, which deprecated the PROVIDE variable in Portage. This was a useful variable because it gave us a "link" from the installed package to the virtual it provided. No such link currently exists, which is a reduction in useful functionality.

It is recommended that PROVIDE be un-deprecated in Portage, and used by ebuilds solely for recording which virtuals are provided by the ebuild, so that they can be queried later once the package is installed. This would allow PROVIDES_TYPE and PROVIDES_VERSION -- functionality duplicated by the Java tools -- to be removed, further reducing the complexity of the Java tools codebase.

java-config Dependency Handling

With these changes, java-config and depend-java-query can tap directly into the power of Portage dependency handling. Let's see how.

Query: List installed VMs

To list installed VMs, java-config could look in /usr/share/java-config-2/vm/ to get a list of all installed Java VMs. Thanks to the PORTAGE_ATOM variable, it can then also compile a list of the corresponding package atoms and so locate each VM's entry in /var/db/pkg.
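A minimal sketch of this query follows, assuming each VM settings file carries a PORTAGE_ATOM line as proposed and that its value is the category/package-version string under which Portage records the installed package. The helper names `installed_vm_atoms` and `vardb_path` are hypothetical, and the demo uses a temporary directory in place of /usr/share/java-config-2/vm/, with a made-up `legacy-vm-1.6` file standing in for a VM that has not been updated.

```python
import os
import re
import tempfile

# A PORTAGE_ATOM="..." line anywhere in a settings file.
_ATOM_LINE = re.compile(r'^PORTAGE_ATOM="?([^"\s]+)"?\s*$', re.MULTILINE)


def installed_vm_atoms(vm_dir):
    """Map each VM handle (file name in vm_dir) to its PORTAGE_ATOM value.

    Settings files without a PORTAGE_ATOM line are skipped.
    """
    atoms = {}
    for handle in sorted(os.listdir(vm_dir)):
        path = os.path.join(vm_dir, handle)
        if not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8") as fh:
            found = _ATOM_LINE.search(fh.read())
        if found:
            atoms[handle] = found.group(1)
    return atoms


def vardb_path(atom, vardb_root="/var/db/pkg"):
    """Directory in which the installed package would be recorded."""
    return os.path.join(vardb_root, atom)


# Demo against a temporary directory instead of the real VM directory.
with tempfile.TemporaryDirectory() as vm_dir:
    with open(os.path.join(vm_dir, "oracle-jdk-bin-1.7"), "w", encoding="utf-8") as fh:
        fh.write('VMHANDLE="oracle-jdk-bin-1.7"\n'
                 'PORTAGE_ATOM="dev-java/oracle-jdk-bin-1.7.0.60"\n')
    with open(os.path.join(vm_dir, "legacy-vm-1.6"), "w", encoding="utf-8") as fh:
        fh.write('VMHANDLE="legacy-vm-1.6"\n')  # no PORTAGE_ATOM yet
    result = installed_vm_atoms(vm_dir)

for handle, atom in result.items():
    print(handle, "->", vardb_path(atom))
```

Skipping files that lack PORTAGE_ATOM is one possible migration choice; a real tool might instead warn about such VMs or fall back to the old handle-based lookup.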

Query: Is a Suitable or Best VM Selected?

Currently, java-utils-2.eclass attempts to magically select the proper VM for building based on the contents of the DEPEND string. This section describes how that functionality would be implemented correctly, using the Portage API rather than custom code.

To perform this query correctly and efficiently, depend-java-query can compile a list of Portage atoms for all installed Java VMs, along with the virtuals they provide, using PORTAGE_ATOM, PROVIDES_TYPE and PROVIDES_VERSION. Then, the DEPEND handed to it can be correctly evaluated using Portage functions. This is how the Portage side of the algorithm would work:

  1. Pre-process DEPEND string:
    1. Remove all components that are not related to virtual/jdk and virtual/jre.
    2. Evaluate DEPEND string using currently-active USE settings to remove all conditionals.
  2. Create a fakedbapi and cpv_inject() the current system VM Portage atom into it.
  3. Evaluate the pre-processed DEPEND string and see if the dependencies are satisfied.
  4. If so, a suitable VM is selected.
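The data flow of the steps above can be sketched in plain Python. This is only a stand-alone approximation for illustration: it does not call the Portage API, whereas the point of the proposal is that a real implementation would delegate the evaluation to Portage (a fakedbapi populated with cpv_inject()) rather than reimplement it. The sketch reduces USE conditionals before filtering because that is simpler on a flat token list, it assumes a well-formed DEPEND string, and it ignores `||` groups, slot dependencies and `*` globs. All function names and the `java7` USE flag are invented for the example.

```python
import operator
import re

# Only virtual/jdk and virtual/jre atoms, with an optional operator and version.
_JAVA_ATOM = re.compile(r"^(>=|<=|>|<|=)?virtual/(jdk|jre)(?:-(\d+(?:\.\d+)*))?$")
_OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
        "<": operator.lt, "=": operator.eq}


def version_tuple(text):
    return tuple(int(part) for part in text.split("."))


def reduce_use_conditionals(depend, use):
    """Step 1.2: drop 'flag? ( ... )' groups whose condition is false."""
    kept, active, pending = [], [True], None
    for token in depend.split():
        if token.endswith("?"):
            flag = token[:-1]
            pending = (flag[1:] not in use) if flag.startswith("!") else (flag in use)
        elif token == "(":
            # A group is active if its parent is and its condition (if any) holds.
            active.append(active[-1] and pending is not False)
            pending = None
        elif token == ")":
            active.pop()
        elif active[-1]:
            kept.append(token)
    return kept


def java_constraints(tokens):
    """Step 1.1: keep only virtual/jdk and virtual/jre atoms, as tuples."""
    matches = (_JAVA_ATOM.match(token) for token in tokens)
    return [m.groups() for m in matches if m]


def vm_satisfies(vm, depend, use):
    """Steps 2-4: check one VM's PROVIDES_* values against the reduced DEPEND."""
    provides = set(vm["PROVIDES_TYPE"].lower().split())
    have = version_tuple(vm["PROVIDES_VERSION"])
    for op, kind, version in java_constraints(reduce_use_conditionals(depend, use)):
        if kind not in provides:
            return False
        if op and version and not _OPS[op](have, version_tuple(version)):
            return False
    return True


DEPEND = ">=virtual/jdk-1.6 dev-java/ant-core java7? ( >=virtual/jdk-1.7 )"
SYSTEM_VM = {"PROVIDES_TYPE": "JDK JRE", "PROVIDES_VERSION": "1.7"}
OLD_VM = {"PROVIDES_TYPE": "JDK JRE", "PROVIDES_VERSION": "1.6"}

print(vm_satisfies(SYSTEM_VM, DEPEND, {"java7"}))
print(vm_satisfies(OLD_VM, DEPEND, {"java7"}))
print(vm_satisfies(OLD_VM, DEPEND, set()))
```

With the `java7` flag enabled, the conditional atom becomes part of the requirement, so the 1.6 VM is rejected; with the flag disabled, only `>=virtual/jdk-1.6` remains and the same VM is accepted.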

To find the "best match" VM, a similar process would be followed. Instead of injecting only the current system VM with cpv_inject(), all installed VMs would be injected and a best match would be found.
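The best-match variant can be sketched the same way. The example below takes constraints that have already been pre-processed into (operator, kind, version) tuples and, as an assumption not stated in the proposal, treats "best" as the satisfying VM with the highest PROVIDES_VERSION. It is a pure-Python stand-in rather than Portage API code, and the function names and the two `example-*` VM handles are hypothetical.

```python
import operator

_OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
        "<": operator.lt, "=": operator.eq}


def version_tuple(text):
    return tuple(int(part) for part in text.split("."))


def satisfies(vm, constraints):
    """True if the VM meets every (operator, kind, version) constraint."""
    provides = set(vm["PROVIDES_TYPE"].lower().split())
    have = version_tuple(vm["PROVIDES_VERSION"])
    return all(
        kind in provides and _OPS[op](have, version_tuple(version))
        for op, kind, version in constraints
    )


def best_vm(vms, constraints):
    """Return the satisfying VM with the highest PROVIDES_VERSION, or None."""
    candidates = [vm for vm in vms if satisfies(vm, constraints)]
    if not candidates:
        return None
    return max(candidates, key=lambda vm: version_tuple(vm["PROVIDES_VERSION"]))


# Stand-in for the list compiled from all installed VM settings files.
INSTALLED = [
    {"VMHANDLE": "example-jre-1.6", "PROVIDES_TYPE": "JRE", "PROVIDES_VERSION": "1.6"},
    {"VMHANDLE": "example-jdk-1.6", "PROVIDES_TYPE": "JDK JRE", "PROVIDES_VERSION": "1.6"},
    {"VMHANDLE": "oracle-jdk-bin-1.7", "PROVIDES_TYPE": "JDK JRE", "PROVIDES_VERSION": "1.7"},
]

choice = best_vm(INSTALLED, [(">=", "jdk", "1.6")])
print(choice["VMHANDLE"] if choice else "no suitable VM")
```

Other definitions of "best" are possible, for example preferring the current system VM whenever it already satisfies the constraints, which would avoid switching VMs unnecessarily.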

Future Directions

Reusing Code, Merging into Portage

Since this functionality could be useful to other eclasses, it could serve as a test case for a more generic helper function that could be integrated into Portage. This would allow other eclasses to implement similar functionality without needing their own custom helper applications.

The java-config settings file could be merged into /var/db/pkg and a simple API could be added to vartree and dblink to provide API access to it. This would provide a toolkit for eclasses for storing extra configuration settings with installed ebuilds, which could be useful for a variety of purposes.

All these goals support the idea of code re-use and maintainability, and of addressing the problem at the correct architectural level.