Enterprise TensorFlow - Executing a TensorFlow Session in Java

A TensorFlow Session can be executed in Java in the same way as in Python. This post shows how.

We have managed to load a TensorFlow SavedModel in Java. Now it is time to get results out of the model. Luckily, the idiom for this is just the same as in low-level TensorFlow. All we need to do is identify the nodes that define the input and output of our computing graph, wrap data in Tensors and run a session.

As with the previous posts, you can find the complete working code on GitHub.

Running a TensorFlow session in Java requires the following steps:

Wrap your input data in Tensor objects using the static helpers in the Tensors class
Get a Session object
Create a Runner object for the session
Assign the input Tensors to the proper nodes in your graph with Runner.feed
Define the output you want returned with Runner.fetch
Execute the computation with Runner.run
Unwrap the result Tensors using one of the Tensor’s convenience methods or a copyTo call
Make sure you close all Tensor objects

This might look quite daunting, but it is very simple in practice, thanks to a well-documented API, lots of helper functions and a nice fluent interface.

Wrapping data in `org.tensorflow.Tensor` objects

The Tensor class is the most important class when using the TensorFlow Java wrapper. It is used to wrap data that is fed to the TensorFlow engine and to unwrap the results that come back. The most complicated part of running our model in Java is wrapping and unwrapping our data correctly. Luckily, if we do something wrong, the resulting error messages are meaningful and verbose, so this is normally an easy job.

In almost all use cases, you can simply call one of the helper methods in the Tensors class to create a Tensor of the proper shape, data type and content. Here is an example of wrapping a single input float value:

final Tensor<Float> t = Tensors.create(f);

There are helper methods for all data types and up to six dimensions, so you should find everything you need there. As we will see later, you may still want to wrap all Tensor creation in function calls of your own to make resource handling a bit easier - you must make sure to call .close() on all created Tensors!

In very rare cases you may have to resort to the create calls on the Tensor class itself - this allows you to create a Tensor of any shape. For completeness' sake, here is an example of manually wrapping a float in a Tensor (do not do this unless you absolutely have to):

final Tensor<Float> t = Tensor.create(
    new long[] {1}, // the shape
    FloatBuffer.wrap(new float[] {f}) // the data
);
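To make the relationship between the shape array and the flat data buffer more concrete, here is a minimal stand-alone sketch that uses only java.nio. It assumes the buffer holds the elements in row-major order, and the names ShapeAndBuffer, shapeOf and flatten are made up for this illustration - they are not part of the TensorFlow API.

```java
import java.nio.FloatBuffer;

// Hypothetical helper, not part of TensorFlow: prepares the two arguments
// (shape and flat data) that a buffer-based Tensor.create call expects.
final class ShapeAndBuffer {

    // Shape of a rectangular 2-D array: {rows, columns}.
    static long[] shapeOf(final float[][] matrix) {
        return new long[] {matrix.length, matrix.length == 0 ? 0 : matrix[0].length};
    }

    // Copies the matrix row by row (row-major order) into one flat buffer.
    static FloatBuffer flatten(final float[][] matrix) {
        final long[] shape = shapeOf(matrix);
        final FloatBuffer buffer = FloatBuffer.allocate((int) (shape[0] * shape[1]));
        for (final float[] row : matrix) {
            if (row.length != shape[1]) {
                throw new IllegalArgumentException("matrix must be rectangular");
            }
            buffer.put(row);
        }
        buffer.flip(); // reset the position to 0 so the buffer is read from the start
        return buffer;
    }

    public static void main(final String[] args) {
        final float[][] batch = {{1f, 2f, 3f}, {4f, 5f, 6f}};
        final long[] shape = shapeOf(batch);
        final FloatBuffer data = flatten(batch);
        System.out.println(shape[0] + " x " + shape[1] + ", " + data.remaining() + " values");
    }
}
```

The resulting pair could then be handed to Tensor.create(shape, data) in the same way as the one-element example above. The number of values in the buffer has to match the product of the shape entries.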

Running a session and retrieving results

As in the Python low-level API, a model is executed in a session. To get a handle to a Session object, we just call the SavedModelBundle.session() method. The Session object is in turn used to get a Runner. The Runner provides a fluent API that binds Tensors to nodes in the graph with Runner.feed and defines which Tensors to return after the computation is complete with Runner.fetch. The fluent API works like a builder: each call returns the Runner again, so we can chain calls. When everything is wired, we call Runner.run() to perform the computation and return the result. The result is a list of Tensors with one entry per Runner.fetch call. This is a complete example chaining all calls into one long statement:

final Tensor<?> result = 
    // gets the session
    bundle.session() 
    // creates a runner
    .runner() 
    // binds tensors to input nodes in the graph, in our case 
    // `values` is an array of floats, toTensor creates a Tensor
    // object, the first argument is a string with the name of 
    // the input node
    .feed("wine_type"           , toTensor(values[1], tensorsToClose))
    .feed("fixed_acidity"       , toTensor(values[2], tensorsToClose))
    .feed("volatile_acidity"    , toTensor(values[3], tensorsToClose))
    .feed("citric_acid"         , toTensor(values[4], tensorsToClose))
    .feed("residual_sugar"      , toTensor(values[5], tensorsToClose))
    .feed("chlorides"           , toTensor(values[6], tensorsToClose))
    .feed("free_sulfur_dioxide" , toTensor(values[7], tensorsToClose))
    .feed("total_sulfur_dioxide", toTensor(values[8], tensorsToClose))
    .feed("density"             , toTensor(values[9], tensorsToClose))
    .feed("ph"                  , toTensor(values[10], tensorsToClose))
    .feed("sulphates"           , toTensor(values[11], tensorsToClose))
    .feed("alcohol"             , toTensor(values[12], tensorsToClose))
    // define which output tensor to return
    // (you can chain multiple `fetch` calls to 
    // return more than one tensor)
    .fetch("dnn/head/logits:0")
    // execute the runner - this returns a list
    .run()
    // We have only one fetch call, so we get a 
    // one-element-list. The `get(0)` call fetches
    // the first element of the list
    .get(0);
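The twelve feed calls above all follow the same pattern: a node name paired with one entry of the values array. As a minimal sketch, the pairing could be kept in one place and driven by a loop. The class WineInputs and the method byNode are hypothetical names for this illustration, and the sketch assumes that values[0] is not a model input, since the feed calls above start at index 1.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of TensorFlow: pairs input node names with
// the values of one data row, so the feed calls can be driven by a loop.
final class WineInputs {

    // Node names in the same order as the columns values[1] .. values[12].
    static final String[] INPUT_NODES = {
        "wine_type", "fixed_acidity", "volatile_acidity", "citric_acid",
        "residual_sugar", "chlorides", "free_sulfur_dioxide",
        "total_sulfur_dioxide", "density", "ph", "sulphates", "alcohol"
    };

    // Assumption: values[0] is not a model input and is skipped.
    static Map<String, Float> byNode(final float[] values) {
        if (values.length < INPUT_NODES.length + 1) {
            throw new IllegalArgumentException(
                    "expected at least " + (INPUT_NODES.length + 1) + " values");
        }
        // LinkedHashMap keeps the insertion order of the node names.
        final Map<String, Float> inputs = new LinkedHashMap<>();
        for (int i = 0; i < INPUT_NODES.length; i++) {
            inputs.put(INPUT_NODES[i], values[i + 1]);
        }
        return inputs;
    }

    public static void main(final String[] args) {
        final float[] row = new float[13];
        row[12] = 9.4f;
        System.out.println(byNode(row).get("alcohol"));
    }
}
```

Each map entry could then be passed to Runner.feed inside a loop, with toTensor wrapping the value exactly as in the statement above.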

Unwrapping resulting `Tensor`s

What is left now is to get the result out of the Tensor returned by the run() call. If the result Tensor is simply a scalar, you can just call Tensor.floatValue(), Tensor.booleanValue() etc. If the resulting tensor is not a scalar, the data needs to be retrieved with Tensor.copyTo(U destination), where destination is a multidimensional array. Prepackaged neural network regression estimators, for example, always return a two-dimensional tensor, even if you only have one single numerical result. In that case, you can retrieve the result like this:

float[][] resultValues = (float[][]) result.copyTo(new float[1][1]);
float prediction = resultValues[0][0];

The type and number of dimensions of the array depend on your model.
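If the dimensions are not known at compile time, the destination array can be allocated from a shape array such as the one returned by Tensor.shape(). The following is a minimal sketch using only java.lang.reflect; DestinationArrays and newFloatArray are hypothetical names, and the sketch only covers float tensors.

```java
import java.lang.reflect.Array;

// Hypothetical helper, not part of TensorFlow: allocates a float array whose
// dimensions match a shape such as the one returned by Tensor.shape().
final class DestinationArrays {

    static Object newFloatArray(final long[] shape) {
        if (shape.length == 0) {
            throw new IllegalArgumentException("scalar tensors need no destination array");
        }
        final int[] dims = new int[shape.length];
        for (int i = 0; i < shape.length; i++) {
            dims[i] = Math.toIntExact(shape[i]); // fails loudly on overflow
        }
        return Array.newInstance(float.class, dims);
    }

    public static void main(final String[] args) {
        // One prediction for one example corresponds to the shape {1, 1}.
        final float[][] destination = (float[][]) newFloatArray(new long[] {1, 1});
        System.out.println(destination.length + " x " + destination[0].length);
    }
}
```

The caller still has to cast the returned Object to the concrete array type, so this only helps with sizing the array, not with choosing its rank.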

Resource management

Two types of objects need manual closing for proper resource handling: Sessions and Tensors. Note that all Tensor objects - whether created manually or returned from running a session - must be closed manually. I prefer to do this by performing all Tensor creation in helper functions that collect all created Tensors in a Collection and then free everything in a finally block after I am done:

private static Tensor<Float> toTensor(final float f, 
        final Collection<Tensor<?>> tensorsToClose) 
{
    final Tensor<Float> t = Tensors.create(f);
    if (tensorsToClose != null) {
        tensorsToClose.add(t);
    }
    return t;
}       

private static void closeTensors(final Collection<Tensor<?>> ts) {      
    for (final Tensor<?> t : ts) {
        try {
            t.close();
        } catch (final Exception e) {
            // TODO: decide on the error handling best fitting your use case here
            // In most cases logging is the only useful thing left to do
            System.err.println("Error closing Tensor.");
            e.printStackTrace();
        }
    }
    ts.clear();
}

private void runSession(final float foo /* , more params here */) {
    final List<Tensor<?>> tensorsToClose = new ArrayList<Tensor<?>>(); 
    try {            
        // run session
        final List<Tensor<?>> result = bundle.session().runner()
            .feed("foo", toTensor(foo, tensorsToClose))
            // ... feed more tensors as necessary ...
            .fetch("some_node")
            // ... fetch more tensors as necessary ...
            .run(); 
        // mark result for cleanup
        tensorsToClose.addAll(result);
        // ... do something with the result ...
    } finally {
        closeTensors(tensorsToClose);
    }
}

Note the absence of Session closing: The session is created once for the SavedModelBundle, the session() call returns an existing reference, not a new session. The Session is thread-safe, so it can be reused everywhere. It only needs closing when you are completely done. So you should close the Session only at the end of your program or when you shut down your server. You can simply do this by closing your SavedModelBundle, which frees all resources associated with the SavedModel. (You may even omit closing the Session, as the end of your JVM process should free all resources associated with it anyway - I never had any negative effects, but do this at your own risk!)
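The collect-and-close approach for Tensors shown above can also be combined with try-with-resources, since Tensor implements AutoCloseable. The following is a minimal sketch of that idea using only the standard library. CloseableScope and FakeTensor are hypothetical names; FakeTensor merely stands in for a real Tensor so the sketch runs without TensorFlow.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of TensorFlow: collects AutoCloseable
// resources and closes them all, in reverse order, when the scope is closed.
final class CloseableScope implements AutoCloseable {

    // Stand-in for a Tensor in this sketch: records when it is closed.
    static final class FakeTensor implements AutoCloseable {
        private final String name;
        private final StringBuilder log;

        FakeTensor(final String name, final StringBuilder log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.append(name);
        }
    }

    private final Deque<AutoCloseable> resources = new ArrayDeque<>();

    // Registers a resource and returns it, so the call can be used inline.
    <T extends AutoCloseable> T track(final T resource) {
        resources.push(resource);
        return resource;
    }

    @Override
    public void close() {
        while (!resources.isEmpty()) {
            try {
                resources.pop().close();
            } catch (final Exception e) {
                // Keep closing the remaining resources; logging is usually all that is left.
                System.err.println("Error closing resource: " + e);
            }
        }
    }

    public static void main(final String[] args) {
        final StringBuilder log = new StringBuilder();
        try (CloseableScope scope = new CloseableScope()) {
            scope.track(new FakeTensor("a", log));
            scope.track(new FakeTensor("b", log));
            throw new IllegalStateException("simulated failure while running the session");
        } catch (final IllegalStateException e) {
            System.out.println("close order: " + log); // prints "close order: ba"
        }
    }
}
```

Closing in reverse order mirrors what try-with-resources does for several resources; closing in insertion order, as closeTensors does, works just as well for independent Tensors.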

Determining the proper names for input and output nodes

If you have written your own Estimator, you probably know the names of the input and output nodes you need, as well as their shapes. Sometimes, however, you have used a prepackaged Estimator and do not know what the output nodes are called, or you have not written the model yourself and need to inspect the saved data to find out. In that case, as with the tag needed to load the MetaGraphs in the previous post, you need to inspect your SavedModel on the command line to determine your tag, your input and output node names, their shapes and data types. This can be done by successive calls to saved_model_cli like this (we use the SavedModel from our example project here; your output will obviously depend on the model you use):

chris$ ~/Library/Python/3.6/bin/saved_model_cli show \
--dir saved_models/1512127459/
The given SavedModel contains the following tag-sets:
serve

chris$ ~/Library/Python/3.6/bin/saved_model_cli show \
--dir saved_models/1512127459/ \
--tag_set serve
The given SavedModel MetaGraphDef contains
SignatureDefs with the following keys:
SignatureDef key: "predict"

chris$ ~/Library/Python/3.6/bin/saved_model_cli show \
--dir saved_models/1512127459/ \
--tag_set serve \
--signature_def predict
The given SavedModel SignatureDef contains the following input(s):
inputs['wine_type'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: wine_type:0
...
The given SavedModel SignatureDef contains the
following output(s):
outputs['predictions'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: dnn/head/logits:0
Method name is: tensorflow/serving/predict

As you can see, we need successive calls to saved_model_cli show to “dig deeper” into our SavedModel to determine the names, shapes and data types of the input and output tensors. Regrettably, as far as I know, this information cannot be retrieved generically with the Java API. The type and shape of result Tensors, however, can also be examined by calls to Tensor.shape() and Tensor.dataType().
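The names printed by saved_model_cli, such as wine_type:0 or dnn/head/logits:0, consist of an operation name and, after the colon, the index of that operation's output. As a minimal sketch of how the two parts relate, the following stand-alone helper splits such a name. TensorName and parse are hypothetical names, and treating a missing suffix as output 0 is an assumption of this sketch.

```java
// Hypothetical helper, not part of TensorFlow: splits a tensor name as printed
// by saved_model_cli ("operation:index") into its two parts.
final class TensorName {

    final String operation;
    final int outputIndex;

    private TensorName(final String operation, final int outputIndex) {
        this.operation = operation;
        this.outputIndex = outputIndex;
    }

    static TensorName parse(final String name) {
        final int colon = name.lastIndexOf(':');
        if (colon < 0) {
            return new TensorName(name, 0); // assumption: no suffix means the first output
        }
        return new TensorName(name.substring(0, colon),
                Integer.parseInt(name.substring(colon + 1)));
    }

    public static void main(final String[] args) {
        final TensorName output = parse("dnn/head/logits:0");
        System.out.println(output.operation + " -> output " + output.outputIndex);
    }
}
```

This is why the feed calls earlier in this post can use the bare name wine_type while the fetch call uses dnn/head/logits:0: both refer to the first output of the respective operation.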

Summary

Running a TensorFlow session in Java is pretty easy, just remember:

SavedModelBundle and the corresponding Session are thread-safe, Tensors are not
Use the saved_model_cli to determine the name and shape of your input and output nodes
Wrap your input data with the helper methods in the Tensors class
Use the fluent API on the SavedModelBundle to acquire and run a session: bundle.session().runner().feed(...).fetch(...).run()
For scalar results: Use Tensor.floatValue() etc. to retrieve data from the resulting Tensors.
For non-scalar results: Use the proper array type and shape to retrieve data from your resulting Tensors using Tensor.copyTo
Only call close on your SavedModelBundle when you are completely done and want to shut down your JVM, e.g. on server shutdown
