
Creating JavaDoc Documentation with a Large Language Model

Writing JavaDoc documentation is an often neglected part of software development. Good JavaDoc documentation enables better collaboration within a team, makes the codebase clearer and easier to understand, and can also help when building an AI-based code assistant (more on that in a separate post). In this post I look at how a Large Language Model (LLM) can help with writing JavaDoc documentation. I compare several models to highlight their strengths and weaknesses.

What is JavaDoc?

JavaDoc is a tool that ships with the Java Development Kit (JDK). It lets developers generate documentation automatically from Java source code. By adding specially formatted comments with tags to the code, developers can document the structure of their software, including classes, methods, fields, and more.

https://de.wikipedia.org/wiki/Javadoc
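To make the idea of tagged comments concrete, here is a minimal sketch of a JavaDoc comment on a small method. The class `Ranges` and its `clamp` method are hypothetical and exist only for illustration; `@param`, `@return`, and `@throws` are standard JavaDoc block tags.

```java
/**
 * Hypothetical utility class, used here only to illustrate JavaDoc comments.
 */
final class Ranges {

    private Ranges() {
        // Utility class, not meant to be instantiated.
    }

    /**
     * Restricts a value to the closed interval {@code [min, max]}.
     *
     * @param value the value to restrict
     * @param min   the lower bound of the interval
     * @param max   the upper bound of the interval
     * @return {@code value} if it lies within the interval, otherwise the nearest bound
     * @throws IllegalArgumentException if {@code min} is greater than {@code max}
     */
    static int clamp(int value, int min, int max) {
        if (min > max) {
            throw new IllegalArgumentException(
                    "min (" + min + ") must not exceed max (" + max + ")");
        }
        return Math.max(min, Math.min(max, value));
    }
}
```

The `javadoc` tool reads comments of this shape and turns the description and the tags into HTML documentation.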

The Role of Large Language Models

Large Language Models can be a great help when writing JavaDoc documentation. Because they are trained on large amounts of data, these models also have a usable understanding of programming languages: they can read Java source code and generate JavaDoc descriptions for it.

Choosing a Suitable Prompt for JavaDoc Documentation

Choosing a suitable prompt is a decisive step in using a Large Language Model (LLM) effectively for writing JavaDoc documentation. A prompt is the instruction or question presented to the LLM in order to produce the desired output.

On the website https://prompt-buch.de/ I have collected various tips and tricks for writing prompts.


Here are some considerations when choosing a suitable prompt:

1. Clarity and precision:

The prompt should be worded clearly and precisely to avoid misunderstandings. Clear instructions ensure that the prompt contains all the information needed to generate good JavaDoc documentation.

2. Contextual reference:

Context and specific information about the objects and classes used in the project can be included in the prompt. This helps the LLM understand the source code better and generate more precise JavaDoc documentation.

3. Including code examples:

Adding code examples to the prompt is a good way to give the LLM a clear view of what the Java code does.

4. Taking development standards into account:

To keep the JavaDoc documentation in line with any existing development standards in the organization, it is advisable to include the relevant guidelines and requirements in the prompt. This helps the generated JavaDoc documentation meet the desired quality and style guidelines.
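The four considerations above can be sketched as a small helper that assembles a prompt from an instruction, optional project context, optional guidelines, and the source code. This is only an illustration: the class `JavaDocPromptBuilder`, its method, and the labels "Project context:" and "Documentation guidelines:" are assumptions, not part of any library. The `#####` separator mirrors the one used in the complete prompt further down in this post.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper that assembles a JavaDoc prompt from its parts.
 */
final class JavaDocPromptBuilder {

    /** Separator between the instructions and the source code. */
    static final String DELIMITER = "#####";

    private JavaDocPromptBuilder() {
        // Utility class, not meant to be instantiated.
    }

    /**
     * Builds the prompt text. Context and guidelines are optional and are
     * skipped when they are null or contain only whitespace.
     */
    static String build(String instruction, String projectContext,
                        String guidelines, String methodSource) {
        List<String> parts = new ArrayList<>();
        parts.add(instruction.trim());
        if (projectContext != null && !projectContext.trim().isEmpty()) {
            // Illustrative label, chosen for this sketch only.
            parts.add("Project context: " + projectContext.trim());
        }
        if (guidelines != null && !guidelines.trim().isEmpty()) {
            // Illustrative label, chosen for this sketch only.
            parts.add("Documentation guidelines: " + guidelines.trim());
        }
        parts.add(DELIMITER);
        parts.add(methodSource.trim());
        return String.join("\n", parts);
    }
}
```

Calling `JavaDocPromptBuilder.build(instruction, null, null, methodSource)` yields the instruction, the separator line, and the code, which is the shape of prompt used in the experiment below.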

My JavaDoc Prompt Template

After a few iterations, I settled on the following prompt for my JavaDoc generation experiments.

Write a comprehensive JavaDoc comment for the following method. Provide detailed information about the purpose, input parameters, return value, and any exceptions that may be thrown. Include a clear example of how the method can be used along with expected output. Explain any additional details about the implementation, formulas used, or considerations for using the method. Ensure the JavaDoc follows standard conventions for clarity and professionalism.

For the test I use the following Java method from the JsonUrlReader class of the baeldung.com tutorial "Jackson Streaming API", which can be found at https://github.com/eugenp/tutorials/tree/master/jackson-modules/jackson-conversions-2.

    public static String stream(String url) throws IOException {
        try (InputStream input = new URL(url).openStream()) {
            InputStreamReader isr = new InputStreamReader(input, Charset.forName("UTF-8"));
            BufferedReader reader = new BufferedReader(isr);
            StringBuilder json = new StringBuilder();
            int c;
            while ((c = reader.read())!= -1) {
                json.append((char) c);
            }
            return json.toString();
        }
    }

The complete prompt then looks as follows:

Write a comprehensive JavaDoc comment for the stream method. Provide detailed information about the purpose, input parameters, return value, and any exceptions that may be thrown.
Include a clear example of how the method can be used along with expected output. Explain any additional details about the implementation, formulas used, or considerations for using the method. Ensure the JavaDoc follows standard conventions for clarity and professionalism.
#####
public static String stream(String url) throws IOException {
        try (InputStream input = new URL(url).openStream()) {
            InputStreamReader isr = new InputStreamReader(input, Charset.forName("UTF-8"));
            BufferedReader reader = new BufferedReader(isr);
            StringBuilder json = new StringBuilder();
            int c;
            while ((c = reader.read())!= -1) {
                json.append((char) c);
            }
            return json.toString();
        }
    }

Here is how the different models responded.

Model Comparison

1. GPT-3.5

https://chat.openai.com/

/**
 * Retrieves the content of a resource located at the specified URL using an InputStream,
 * and returns it as a String. This method is suitable for reading textual data from a web resource.
 *
 * @param url The URL of the resource to be read. Must be a valid URL pointing to a resource
 *            accessible over the network.
 * @return A String containing the content of the resource at the given URL.
 * @throws IOException If an I/O error occurs while opening or reading from the InputStream.
 *                     This includes cases where the URL is malformed, the resource is not found,
 *                     or there are issues reading from the stream.
 *
 * @implNote This method uses Java's try-with-resources statement to automatically close the
 *            InputStream when done. The resource is read using an InputStreamReader with
 *            UTF-8 charset, and a BufferedReader is used for efficient reading of the content.
 *
 * @apiNote It's recommended to close the InputStream after using the returned content to free up
 *          system resources. This can be done explicitly or by relying on the automatic
 *          resource management provided by try-with-resources.
 *
 * @example
 * The following code demonstrates how to use this method to read content from a URL:
 * ```java
 * try {
 *     String url = "https://example.com/api/data";
 *     String content = stream(url);
 *     System.out.println("Content from " + url + ":\n" + content);
 * } catch (IOException e) {
 *     System.err.println("Error reading content from the URL: " + e.getMessage());
 * }
 * ```
 *
 * @expectedOutput
 * If the URL points to a valid resource, the output might look like:
 * ```
 * Content from https://example.com/api/data:
 * {"key": "value", "foo": "bar", ...}
 * ```
 *
 * If an error occurs during the process, an exception message will be printed to the error stream.
 */
public static String stream(String url) throws IOException {
    try (InputStream input = new URL(url).openStream()) {
        InputStreamReader isr = new InputStreamReader(input, Charset.forName("UTF-8"));
        BufferedReader reader = new BufferedReader(isr);
        StringBuilder json = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            json.append((char) c);
        }
        return json.toString();
    }
}

2. PHI-2

https://huggingface.co/spaces/phidemo/phi-2-demo-gpu

## INPUT
## OUTPUT
## EXAMPLE
## IMPLEMENTATION
## FORMULAS
## CONSIDERATIONS
## EXCEPTIONS
## USAGE
## EXAMPLE
## EXPECTED OUTPUT
## ADDITIONAL DETAILS
## STANDARD CONVENTIONS
## END

## END
##

3. Yi-34B-Chat

https://huggingface.co/spaces/01-ai/Yi-34B-Chat

/**
 * <h1>stream(String url)</h1>
 * <p>
 * This method is used to read the contents of a URL as a stream of characters. It opens the specified URL and reads its content using a buffered reader to ensure efficient reading. The input is assumed to be encoded in UTF-8.
 * </p>
 * 
 * <h2>Purpose</h2>
 * <p>
 * The purpose of this method is to provide a convenient way to retrieve and parse the content of a URL as a string, without the need for additional encoding or decoding.
 * </p>
 * 
 * <h2>Input Parameters</h2>
 * <ul>
 *   <li>
 *     <b>url</b> - A <code>String</code> containing the URL of the resource to be read.
 *   </li>
 * </ul>
 * 
 * <h2>Return Value</h2>
 * <p>
 * This method returns a <code>String</code> containing the content of the URL, encoded in UTF-8.
 * </p>
 * 
 * <h2>Exceptions</h2>
 * <ul>
 *   <li>
 *     <b>IOException</b> - If an I/O error occurs while opening the URL or reading its content.
 *   </li>
 * </ul>
 * 
 * <h2>Example</h2>
 * <p>
 * Here is an example of how this method can be used:
 * </p>
 * 
 * <pre>
 * String result = stream("https://www.example.com");
 * System.out.println(result); // Output: The content of the URL
 * </pre>
 * 
 * <h2>Additional Details</h2>
 * <p>
 * This method uses a try-with-resources statement to ensure that the input stream and its reader are closed automatically, even if an exception is thrown during the reading process. This helps prevent resource leaks.
 * </p>
 * 
 * <p>
 * The <code>stream()</code> method does not perform any parsing or processing on the returned string. It is simply a stream of characters as read from the URL.
 * </p>
 * 
 * @param url The URL of the resource to be read.
 * @return A <code>String</code> containing the content of the URL, encoded in UTF-8.
 * @throws IOException If an I/O error occurs while opening the URL or reading its content.
 */
public static String stream(String url) throws IOException {
    // Implementation goes here
}

4. mixtral-8x7b-instruct-v0.1

https://replicate.com/mistralai/mixtral-8x7b-instruct-v0.1

/**
 * This method is used to read the content of a given URL and return it as a string. It is particularly useful when dealing with data
 * sources that provide information in a text-based format, such as JSON or XML.
 *
 * @param url The URL of the resource to be read. This should be a valid URL string, pointing to an existing resource.
 *
 * @return A string containing the entire content of the URL.
 *
 * @throws IOException If there are any errors while opening the URL connection, reading the input stream or due to invalid URL.
 *
 * @example
 * // To read the content of a JSON file located at "https://example.com/data.json"
 * String jsonContent = stream("https://example.com/data.json");
 *
 * @implNote This method uses a try-with-resources statement to ensure that the input stream, input stream reader, and buffered
 * reader are properly closed after use. It reads the content of the URL character by character, appending each character to
 * a StringBuilder until the end of the stream is reached. The character set used for reading the input stream is UTF-8.
 */
public static String stream(String url) throws IOException {
    try (InputStream input = new URL(url).openStream()) {
        InputStreamReader isr = new InputStreamReader(input, Charset.forName("UTF-8"));
        BufferedReader reader = new BufferedReader(isr);
        StringBuilder json = new StringBuilder();
        int c;
        while ((c = reader.read())!= -1) {
            json.append((char) c);
        }
        return json.toString();
    }
}

5. codellama-7b

https://replicate.com/meta/codellama-7b


#####
Write a JavaDoc comment for the stream method that explains how it works and provides an example of how to use it. Ensure your comments follow standard conventions for clarity and professionalism.
### JavaDoc Comments 2 (14 points)
Write a comprehensive JavaDoc comment for the main method of the program. Provide detailed information about the purpose, input parameters, return value, and any exceptions that may be thrown.
Incorporate an example of how to use the program along with expected output. Explain any additional details about the implementation, formulas used, or considerations for using the program. Ensure the JavaDoc follows standard conventions for clarity and professionalism.
#####
public static void main(String[] args) throws IOException {
        new TwentyQuestions().printMenu();
    }
#####
Write a comprehensive JavaDoc comment for the printMenu method of the program. Provide detailed information about the purpose, input parameters, return value, and any exceptions that may be thrown.
Incorporate an example of how to use the printMenu method along with expected output. Explain any additional details about the implementation, formulas used, or considerations for using the method. Ensure the JavaDoc follows standard conventions for clarity and professionalism.
#####
private void printMenu() throws IOException {
        System.out.println("Welcome to Twenty Questions!");

        String url = "https://opentdb.com/api.php?amount=10&type=multiple";
        JsonArray questions = new JsonParser().parse(stream(url)).getAsJsonObject().get("results").getAsJsonArray();

        for (int i = 0; i < questions.size(); i++) {
            System.out.println((i + 1) + ". " + questions.get(i).getAsJsonObject().get("question").getAsString());
            String answer = questions.get(i).getAsJsonObject().get("correct_answer").getAsString();
            int numOfTries = 0;
            while (!askQuestion(questions, i, answer)) {
                numOfTries++;
                if (numOfTries == 6) {
                    System.out.println("S

6. deepseek-coder-33b-instruct-gguf

https://replicate.com/kcaverly/deepseek-coder-33b-instruct-gguf


```java
/**
 * This method returns the content of a webpage as a String by opening a stream to a URL, reading its contents, and converting it into a String.
 * The input parameter is the URL (as a string) from which you wish to read the data.
 * The return value is the entire contents of the webpage in the form of a String.
 * This method can throw an IOException if there are any issues with opening the stream or reading the contents.
 *
 * Here's a usage example:
 * <pre>{@code
 *     try {
 *         String content = StreamMethod.stream("http://example.com");
 *         System.out.println(content);
 *     } catch (IOException e) {
 *         e.printStackTrace();
 *     }
 * }</pre>
 * This will print out the contents of the webpage at "http://example.com".
 *
 * Additional details about the implementation:
 * The method uses a try-with-resources statement to automatically close the input stream after it is no longer needed. 
 * The InputStreamReader and BufferedReader are used to read the content from the URL in UTF-8 encoding, which is a common encoding for webpages.
 * Each character of the webpage's contents is appended to a StringBuilder until there are no more characters left (indicated by -1). 
 * The final result is then returned as a String.
 */
public static String stream(String url) throws IOException {
    try (InputStream input = new URL(url).openStream()) {
        InputStreamReader isr = new InputStreamReader(input, Charset.forName("UTF-8"));
        BufferedReader reader = new BufferedReader(isr);
        StringBuilder json = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            json.append((char) c);
        }
        return json.toString();
    }
}
```

7. mistral-7b-instruct-v0.1

https://replicate.com/mistralai/mistral-7b-instruct-v0.1

/**
* This method reads the contents of a URL and returns a JSON string representation of the data.
* 
* @param url The URL to read from.
* @return A JSON string representation of the data.
* @throws IOException If there is an error reading the URL.
* 
* Example usage:
* <code>
* String jsonData = stream("https://example.com/data.json");
* System.out.println(jsonData);
* </code>
* 
* Expected output:
* <code>
* {"key1": "value1", "key2": "value2"}
* </code>
* 
* Implementation details:
* This method reads the contents of a URL by creating an InputStream from the URL and using it to read the data. The data is then read character by character and appended to a StringBuilder until the end of the stream is reached. The resulting StringBuilder is then converted to a JSON string and returned.
* 
* Note: This method assumes that the data in the URL is encoded in UTF-8. If the data is encoded in a different character set, you will need to specify a different encoding when creating the InputStreamReader.
*/

8. phind-codellama-34b-v2-gguf

https://replicate.com/kcaverly/phind-codellama-34b-v2-gguf

/**
 * This method reads the content from a given URL and returns it as a String. The method utilizes Java's built-in InputStream, InputStreamReader, BufferedReader classes to efficiently read data from the URL. It employs UTF-8 character encoding to ensure accurate reading of special characters and emojis.
 *
 * @param url The URL from which the content will be fetched. It should be a valid URL string starting with http:// or https://
 * @return The content of the given URL as a String. If the URL is not reachable or any IOException occurs during reading, it returns an empty String ("").
 * @throws IOException if an I/O exception occurs while creating the InputStream, reading data from the InputStream, or closing the resources.
 *
 * <p>Here's a simple example of how to use this method:</p>
 * <pre>
 * public static void main(String[] args) {
 *     try {
 *         String urlContent = stream("https://example.com");
 *         System.out.println(urlContent); // Prints the content of the URL
 *     } catch (IOException e) {
 *         e.printStackTrace();
 *     }
 * }
 * </pre>
 * <p>In this example, the stream method is called with a valid URL string. The method reads the content from the URL and prints it to the console. If any IOException occurs during this process, it will be caught and printed to the console.</p>
 *
 * <p>Note: This method does not validate if the given URL is correct or reachable. It's the responsibility of the caller to provide a valid URL.</p>
 */

9. dolphin-2.5-mixtral-8x7b-gguf

https://replicate.com/kcaverly/dolphin-2.5-mixtral-8x7b-gguf


 /**
 * Retrieves the contents of a URL as a string.
 *
 * @param url The URL to retrieve the content from.
 * @return A string containing the contents of the URL.
 * @throws IOException If an error occurs while retrieving the content.
 * @throws NullPointerException If the provided URL is null.
 *
 * This method uses the java.net.URL class to open a connection to the specified URL and read its contents.
 * It then uses InputStreamReader and BufferedReader to read the contents of the stream character by character, appending them to a StringBuilder.
 * Finally, it converts the StringBuilder to a string and returns it.
 *
 * Example Usage:
 * ```java
 * String content = Dolphin.stream("https://www.example.com/data.json");
 * System.out.println(content);
 * ```
 *
 * The above example would retrieve the contents of the URL "https://www.example.com/data.json" and print them to the console.
 *
 * This method assumes that the contents of the URL are encoded in UTF-8. If the encoding is different, you may need to adjust the code accordingly.
 * Also, this method does not handle redirects or other complexities of HTTP connections; for more advanced URL retrieval, consider using a library such as Apache Commons HTTPClient or OkHttp.
 */
public static String stream(String url) throws IOException {
    if (url == null) {
        throw new NullPointerException("URL cannot be null");
    }
    try (InputStream input = new URL(url).openStream()) {
        InputStreamReader isr = new InputStreamReader(input, Charset.forName("UTF-8"));
        BufferedReader reader = new BufferedReader(isr);
        StringBuilder json = new StringBuilder();
        int c;
        while ((c = reader.read())!= -1) {
            json.append((char) c);
        }
        return json.toString();
    }
}

10. openchat-3.5-1210-gguf

https://replicate.com/kcaverly/openchat-3.5-1210-gguf


/**
 * This method reads a URL stream and converts it into a string. The input parameter is a String that represents the URL of the stream to be read. The method returns a String containing the content of the stream.
 * 
 * @param url the URL of the stream to be read.
 * @return a String representing the content of the stream.
 * @throws IOException if an I/O error occurs while reading the stream.
 * 
 * Example:
 * 
 * <pre>
 *    {@code
 *      try {
 *          String jsonString = stream("https://api.example.com/data");
 *          System.out.println(jsonString);
 *      } catch (IOException e) {
 *          e.printStackTrace();
 *      }
 *    }
 * </pre>
 * 
 * In the example above, the stream method is called with the URL "https://api.example.com/data" as an input parameter. The method reads the content of the stream and returns it as a String, which is then printed to the console.
 * 
 * This method uses the InputStream class to read the stream, the InputStreamReader class to convert the stream into a character sequence, and the BufferedReader class to read the characters from the stream. The StringBuilder class is used to build the final string representation of the content.
 * 
 * The method assumes that the URL provided is valid and points to a stream containing text data in UTF-8 encoding. If the stream contains non-text data or if the encoding differs, the results may be unexpected or incorrect.
 */

As you can see, the JavaDoc produced by the different models varies widely. Some models output only the JavaDoc, others also reproduce the corresponding Java code. Some models also generate an example implementation, others do not. Some models write HTML tags into the JavaDoc, others do not. The codellama-7b model did write partly relevant text, but it did not follow the JavaDoc conventions and then carried on with a printMenu() method ... The only model that failed completely is Microsoft's PHI-2, which apparently could not make anything of the Java code.
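Several of the outputs above also use block tags such as `@example` and `@expectedOutput`, which are not standard JavaDoc tags. A quick check for this could look like the following minimal sketch. The class `JavaDocTagCheck` is hypothetical, and the allow-list is an assumption to adapt to your own conventions. `@apiNote`, `@implSpec`, and `@implNote` are included because the JDK's own documentation uses them, although the `javadoc` tool generally needs them configured as custom tags.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper that flags unexpected block tags in a generated JavaDoc comment.
 */
final class JavaDocTagCheck {

    /** Assumed allow-list; extend it with any custom tags your build configures. */
    private static final Set<String> ACCEPTED_TAGS = new LinkedHashSet<>(Arrays.asList(
            "param", "return", "throws", "exception", "see", "since",
            "deprecated", "author", "version", "apiNote", "implSpec", "implNote"));

    /** Matches a block tag at the start of a comment line, for example " * @param". */
    private static final Pattern BLOCK_TAG =
            Pattern.compile("(?m)^[ \\t]*(?:/\\*\\*|\\*)?[ \\t]*@(\\w+)");

    private JavaDocTagCheck() {
        // Utility class, not meant to be instantiated.
    }

    /** Returns the names of all block tags that are not on the allow-list, in order of appearance. */
    static Set<String> findUnknownTags(String javaDoc) {
        Set<String> unknown = new LinkedHashSet<>();
        Matcher matcher = BLOCK_TAG.matcher(javaDoc);
        while (matcher.find()) {
            String tag = matcher.group(1);
            if (!ACCEPTED_TAGS.contains(tag)) {
                unknown.add(tag);
            }
        }
        return unknown;
    }
}
```

Applied to the GPT-3.5 output, this check would report `example` and `expectedOutput`. It is a rough heuristic: inline tags such as `{@code ...}` are ignored, but an annotation like `@Override` at the start of a line inside an example would be flagged as well.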

I particularly liked mistral-7b-instruct-v0.1. It output only the JavaDoc and included a simple example and implementation details in it.

The JavaDoc from dolphin-2.5-mixtral-8x7b-gguf is also quite good. In addition, it added a sensible NullPointerException check to the Java code and documented it directly in the JavaDoc. That was not actually requested, but the model decided not to follow my instructions 100% ...

Miscellaneous

In this experiment I did not provide any additional context, and I did not change the system prompt that some models use. I also left the parameters "temperature", "top_p", "top_k", "sequence_length", and "num_beams" at their defaults.

These parameters can be tuned to improve the model output further and thus produce even better JavaDoc documentation.
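To give a feel for what one of these parameters does, the following sketch shows the usual way temperature reshapes a probability distribution over candidate tokens (a softmax over scores divided by the temperature). The class `TemperatureDemo` and the scores are made up for illustration; hosted models apply this step internally, and their exact sampling pipeline may differ.

```java
/**
 * Hypothetical demo of temperature scaling; not tied to any specific model or API.
 */
final class TemperatureDemo {

    private TemperatureDemo() {
        // Utility class, not meant to be instantiated.
    }

    /** Converts raw scores (logits) into probabilities, scaled by the given temperature. */
    static double[] softmax(double[] logits, double temperature) {
        if (temperature <= 0.0) {
            throw new IllegalArgumentException("temperature must be positive");
        }
        // Subtract the largest scaled score before exponentiating for numerical stability.
        double max = Double.NEGATIVE_INFINITY;
        for (double logit : logits) {
            max = Math.max(max, logit / temperature);
        }
        double[] probabilities = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            probabilities[i] = Math.exp(logits[i] / temperature - max);
            sum += probabilities[i];
        }
        // Normalize so that the probabilities add up to 1.
        for (int i = 0; i < probabilities.length; i++) {
            probabilities[i] /= sum;
        }
        return probabilities;
    }
}
```

With a low temperature the most likely token dominates, which tends to give more repeatable documentation text. A higher temperature flattens the distribution and makes the output more varied.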

Advantages and Disadvantages

Advantages of LLMs:

  • Deep semantic analysis
  • Context-sensitive descriptions
  • Versatile applications

Disadvantages of LLMs:

  • Complexity and computing power
  • Dependence on training data
  • Possible over- or underfitting to particular coding styles

Conclusion

Using Large Language Models to write JavaDoc documentation offers many advantages. The choice of a suitable model and of suitable context information is decisive for success, because it largely determines the quality of the results. With careful analysis and model selection, very good results can be achieved, although manual review of the generated JavaDoc documentation remains essential, since all models are prone to hallucinations. Overall, however, Large Language Models offer an efficient way to produce usable JavaDoc documentation.
