Crate tabula

Rust bindings for tabulapdf/tabula-java

Prerequisites

In order to use tabula-rs, you will need a tabula-java bytecode archive (jar). You can build it yourself by cloning ssh://git@github.com/tabulapdf/tabula-java.git and then invoking Maven to build it.

git clone git@github.com:tabulapdf/tabula-java.git && cd tabula-java
mvn compile assembly:single

The built archive should then be target/tabula-$TABULA_VER-jar-with-dependencies.jar.
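As a small sketch of how a program might locate that archive before starting the JVM, the helper below assembles the expected file name from a checkout directory and a version string. The function name built_jar_path and the "../tabula-java" checkout location are illustrative assumptions and are not part of tabula-rs.

```rust
use std::path::{Path, PathBuf};

/// Builds the expected location of the assembled jar inside a tabula-java
/// checkout. `built_jar_path` is a hypothetical helper, not a tabula-rs API.
fn built_jar_path(checkout: &Path, version: &str) -> PathBuf {
    checkout
        .join("target")
        .join(format!("tabula-{}-jar-with-dependencies.jar", version))
}

fn main() {
    // Assumed checkout location and version; adjust both to your build.
    let jar = built_jar_path(Path::new("../tabula-java"), "1.0.6-SNAPSHOT");
    if jar.is_file() {
        println!("found {}", jar.display());
    } else {
        eprintln!("{} is missing; run the Maven build first", jar.display());
    }
}
```

Checking for the file up front gives a clearer message than a class-loading failure inside the JVM later on.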

Additionally, make sure $JAVA_HOME/lib/server/libjvm.so is reachable through LD_LIBRARY_PATH or explicitly set it as LD_PRELOAD.
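A rough way to check this requirement from Rust is sketched below, using only the standard library. The helper names jvm_server_dir and path_list_contains are hypothetical. The comparison is a literal one against the entries of LD_LIBRARY_PATH: it does not resolve symlinks, and it does not account for LD_PRELOAD or the linker's default search directories.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// Directory that is expected to hold libjvm.so under a given Java home.
fn jvm_server_dir(java_home: &Path) -> PathBuf {
    java_home.join("lib").join("server")
}

/// Returns true if `dir` appears as an entry of a path list such as
/// LD_LIBRARY_PATH. Entries are compared literally, without resolving symlinks.
fn path_list_contains(path_list: &OsStr, dir: &Path) -> bool {
    env::split_paths(path_list).any(|entry| entry.as_path() == dir)
}

fn main() {
    let java_home = match env::var_os("JAVA_HOME") {
        Some(home) => PathBuf::from(home),
        None => {
            eprintln!("JAVA_HOME is not set");
            return;
        }
    };
    let server_dir = jvm_server_dir(&java_home);
    let ld_path = env::var_os("LD_LIBRARY_PATH").unwrap_or_default();

    if path_list_contains(&ld_path, &server_dir) {
        println!("{} is listed in LD_LIBRARY_PATH", server_dir.display());
    } else {
        eprintln!("{} is not listed in LD_LIBRARY_PATH", server_dir.display());
    }
}
```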

Using tabula-rs

Initializing JVM & accessing JNI

In order to make use of tabula-java, you'll need to start jni::JavaVM with the built archive added to its classpath. You can either do this manually or call TabulaVM::new() with the (space escaped) path to the archive as parameter.

Using TabulaVM you can now access the Java native interface by calling TabulaVM::attach().

let vm = TabulaVM::new("../tabula-java/target/tabula-1.0.6-SNAPSHOT-jar-with-dependencies.jar", false).unwrap();
let env = vm.attach().unwrap();

Instantiating Tabula class

With access to the JNI you can instantiate the Tabula class by calling TabulaEnv::configure_tabula().

let tabula = env.configure_tabula(None, None, OutputFormat::Csv, true, ExtractionMethod::Basic, false, None).unwrap();

Parsing the document

Tabula provides Tabula::parse_document(), which parses the document located at the given path and returns a std::fs::File located in memory.

let file = tabula.parse_document(&std::path::Path::new("./test_data/spanning_cells.pdf"), "test_spanning_cells").unwrap();
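Since the result is an ordinary std::fs::File, its contents can be read with the standard I/O traits. The sketch below assumes the output is UTF-8 text (for example CSV) and that the file cursor may not be at the start, so it rewinds first. read_output is a hypothetical helper, and the temporary file in main only stands in for the handle returned by parse_document.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};

/// Reads the whole file as UTF-8 text, starting from its beginning.
/// `read_output` is an illustrative helper, not part of tabula-rs.
fn read_output(file: &mut File) -> io::Result<String> {
    // Rewind in case the handle was left positioned at the end after writing.
    file.seek(SeekFrom::Start(0))?;
    let mut text = String::new();
    file.read_to_string(&mut text)?;
    Ok(text)
}

fn main() -> io::Result<()> {
    // Stand-in for the file returned by parse_document (assumed CSV output).
    let path = std::env::temp_dir().join("tabula_rs_example_output.csv");
    std::fs::write(&path, "a,b\n1,2\n")?;
    let mut file = File::open(&path)?;

    let text = read_output(&mut file)?;
    for line in text.lines() {
        println!("{}", line);
    }

    std::fs::remove_file(&path)?;
    Ok(())
}
```

Working through the handle rather than reopening by path matters here, because a file held in memory may not have a path that can be opened a second time.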

Re-exports

pub use jni;

Structs

Rectangle: Oxidized technology.tabula.Rectangle

Tabula: Tabula class

TabulaEnv: Java native interface capable of instantiating Tabula class

TabulaVM: Java VM capable of using Tabula

Enums

ExtractionMethod: Oxidized technology.tabula.CommandLineApp$ExtractionMethod

OutputFormat: Oxidized technology.tabula.CommandLineApp$OutputFormat

Constants

Type Definitions

Result returned from JNI