Fuseki SPARQL server
For the purpose of this tutorial I will use the database name test.
Requirements
- Java Development Kit:
apt-get install default-jdk
Run Fuseki as a systemd service
As root go to
cd /usr/local/src
Download & untar
wget https://apache.redkiwi.nl/jena/binaries/apache-jena-fuseki-3.15.0.tar.gz
tar xfvz apache-jena-fuseki-3.15.0.tar.gz
cd apache-jena-fuseki-3.15.0
Try running fuseki-server:
./fuseki-server
- it will create the run/ directory, with config files and the dataset and backup directories
- it will not run if the JDK is not installed
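Since fuseki-server will not start without Java, a small pre-flight check can save a confusing error message. The following is only a sketch; the helper name require_cmd is made up for illustration and is not part of the Fuseki distribution.

```shell
# Hypothetical helper: succeed only if the named command is on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing required command: $1" >&2
    return 1
}

# Usage (run from the unpacked apache-jena-fuseki directory):
#   require_cmd java && ./fuseki-server
```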
Fuseki File Layout
I will follow the filesystem layout suggested by the official documentation [1] for running Fuseki as a service:
Environment Variable    Default Setting
FUSEKI_HOME             /usr/share/fuseki
FUSEKI_BASE             /etc/fuseki
- FUSEKI_HOME (distribution area): essentially the fuseki-server binary and a few helper scripts.
- FUSEKI_BASE (runtime area): a directory that contains the configuration, datasets and logs, which should be backed up and left untouched by updates of the Fuseki binaries.
So let's go ahead and create those directories and move the corresponding files to the right dir
mkdir /usr/share/fuseki
mkdir /etc/fuseki
cp -r {fuseki,fuseki-server,fuseki-server.bat,fuseki-server.jar,fuseki.war,bin,webapp} /usr/share/fuseki/
cp -r run/* /etc/fuseki/
cp log4j2.properties /etc/fuseki/
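The copy steps above could also be wrapped in a function, so that the same layout can be rebuilt after a Fuseki upgrade. This is a minimal sketch under the assumption that the distribution directory looks like the one unpacked earlier; the function name install_fuseki_layout and its three arguments are hypothetical.

```shell
# Hypothetical helper: copy a Fuseki distribution into a home/base layout.
#   $1 = unpacked distribution directory
#   $2 = FUSEKI_HOME (distribution area)
#   $3 = FUSEKI_BASE (runtime area)
install_fuseki_layout() {
    src=$1 home=$2 base=$3
    mkdir -p "$home" "$base" || return 1
    # Binaries and helper scripts go to FUSEKI_HOME; skip anything absent.
    for item in fuseki fuseki-server fuseki-server.bat fuseki-server.jar \
                fuseki.war bin webapp; do
        if [ -e "$src/$item" ]; then
            cp -r "$src/$item" "$home/" || return 1
        fi
    done
    # Runtime files created by the first run go to FUSEKI_BASE.
    if [ -d "$src/run" ]; then
        cp -r "$src/run/." "$base/" || return 1
    fi
    if [ -f "$src/log4j2.properties" ]; then
        cp "$src/log4j2.properties" "$base/" || return 1
    fi
}

# Usage (as root):
#   install_fuseki_layout /usr/local/src/apache-jena-fuseki-3.15.0 \
#       /usr/share/fuseki /etc/fuseki
```

Note that re-running it copies run/ again, so on an upgrade you would probably want to drop that part and leave the existing FUSEKI_BASE untouched.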
And we can make a test run by running:
/usr/share/fuseki/fuseki-server
And checking if the server is up by visiting http://localhost:3030/index.html
Service file
Inside the untarred directory /usr/local/src/apache-jena-fuseki-3.15.0 you can find the file fuseki.service.
This file should be copied to /etc/systemd/system and edited in order to run Fuseki as a service. The file itself is quite self-explanatory. I have made a few changes, mainly related to logging.
cp fuseki.service /etc/systemd/system
vi /etc/systemd/system/fuseki.service
[Unit]
Description=Fuseki

[Service]
# Edit environment variables to match your installation
Environment=FUSEKI_HOME=/usr/share/fuseki
Environment=FUSEKI_BASE=/etc/fuseki
# Edit the line below to adjust the amount of memory allocated to Fuseki
Environment=JVM_ARGS=-Xmx4G
# Edit to match your installation
ExecStart=/usr/share/fuseki/fuseki-server
# Run as user "fuseki"
User=root
Restart=on-abort
# Java processes exit with status 143 when terminated by SIGTERM, this
# should be considered a successful shutdown
SuccessExitStatus=143

### By default, the service logs to journalctl only.
StandardOutput=file:/var/log/fuseki/access.log
StandardError=file:/var/log/fuseki/stderrout.log
#StandardOutput=syslog
#StandardError=syslog
#SyslogIdentifier=fuseki
### This logs to syslog. If, e.g., rsyslogd is used, you can provide a file
### /etc/rsyslog.d/fuseki.conf, consisting of the following two lines (uncommented)
#if $programname == 'fuseki' then /var/log/fuseki/stderrout.log
#if $programname == 'fuseki' then stop

[Install]
WantedBy=multi-user.target
Create logs dir:
mkdir /var/log/fuseki/
Enable and run the service:
systemctl enable fuseki
systemctl start fuseki
Check its status:
systemctl status fuseki
And again check its web UI at http://localhost:3030
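Because the Java process can take a moment to start listening after systemctl start returns, a scripted check may need to poll. The sketch below polls a URL with curl; the function name wait_for_url and its arguments are hypothetical, and it assumes the /$/ping endpoint of the Fuseki admin protocol is reachable without authentication.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
#   $1 = URL to probe, $2 = number of attempts, $3 = seconds between attempts
wait_for_url() {
    url=$1 tries=$2 delay=${3:-1}
    while [ "$tries" -gt 0 ]; do
        # -f makes curl fail on HTTP errors; --max-time bounds each attempt.
        if curl -fsS --max-time 5 -o /dev/null "$url" 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}

# Usage:
#   systemctl start fuseki
#   wait_for_url 'http://localhost:3030/$/ping' 30 1 && echo "Fuseki is up"
```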
You can try to create a dataset and perform an INSERT statement, which will be stored in /etc/fuseki/databases/databasename/
Security
Fuseki security settings are defined in file /etc/fuseki/shiro.ini [2]
Here I can change the default admin password
Note:
- ensure the admin password is something strong
The same file also sets the rights to the Fuseki Administrative HTTP protocol [3] and the access to the existing datasets.
I will allow the dataset test to be queried by anyone but updated only from localhost, so that SMW can write to it while requests coming from outside can query it but not update it, with:
/test/query = anon
/test/update = localhostFilter
Full shiro.ini
[main]
ssl.enabled = false
plainMatcher=org.apache.shiro.authc.credential.SimpleCredentialsMatcher
#iniRealm=org.apache.shiro.realm.text.IniRealm
iniRealm.credentialsMatcher = $plainMatcher
localhostFilter=org.apache.jena.fuseki.authz.LocalhostFilter

[users]
# Implicitly adds "iniRealm = org.apache.shiro.realm.text.IniRealm"
admin=pw123

[roles]

[urls]
# All admin operations have URL paths starting /$/ to avoid clashes with dataset names and this prefix is reserved for the Fuseki control functions.
/$/status = anon
/$/ping = anon
/$/stats/** = anon
# oooowiki dataset
/test/query = anon
/test/update = localhostFilter
# everything else is accessible to admin
/** = authcBasic,user[admin]
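The admin password in the [users] section is only a placeholder and should be replaced with something strong. One possible way to generate a random value is sketched below; the function name gen_password is hypothetical, and it assumes /dev/urandom and od are available, as on most Linux systems.

```shell
# Hypothetical helper: print a random hex string built from $1 random bytes
# (default 16 bytes, which gives 32 hex characters).
gen_password() {
    nbytes=${1:-16}
    # od prints the bytes as hex; tr strips the spaces and newlines od adds.
    head -c "$nbytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}

# Usage:
#   gen_password        # 32 hex characters
#   gen_password 24     # 48 hex characters
```

The generated value would then go on the admin= line of /etc/fuseki/shiro.ini. A hex string avoids characters that might need special handling in the ini file.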
To test, we should restart Fuseki:
systemctl restart fuseki
and run a few requests, from both localhost and an external host:
curl http://localhost:3030/$/status -X POST -H 'Accept: application/sparql-results+json,*/*;q=0.9'
A query coming from both local and remote hosts, where both should succeed:
curl http://localhost:3030/test/query -X POST --data 'query=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0APREFIX+country%3A+%3Chttp%3A%2F%2Feulersharp.sourceforge.net%2F2003%2F03swap%2Fcountries%23%3E%0ASELECT+*%0A%7B%0A+%3Fs+foaf%3Aname+%22Test%22%40en+.%0A%7D' -H 'Accept: application/sparql-results+json,*/*;q=0.9'
curl http://10.0.0.20:3030/test/query -X POST --data 'query=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0APREFIX+country%3A+%3Chttp%3A%2F%2Feulersharp.sourceforge.net%2F2003%2F03swap%2Fcountries%23%3E%0ASELECT+*%0A%7B%0A+%3Fs+foaf%3Aname+%22Test%22%40en+.%0A%7D' -H 'Accept: application/sparql-results+json,*/*;q=0.9'
An update, which should succeed only when coming from localhost, and fail when coming from a remote host:
curl http://localhost:3030/test/update -X POST --data 'update=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0APREFIX+country%3A+%3Chttp%3A%2F%2Feulersharp.sourceforge.net%2F2003%2F03swap%2Fcountries%23%3E%0AINSERT+DATA%0A%7B%0A+country%3Aooo+foaf%3Aname+%22OOOOO%22%40en+.%0A%7D' -H 'Accept: text/plain,*/*;q=0.9'
curl http://10.0.20.2:3030/test/update -X POST --data 'update=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0APREFIX+country%3A+%3Chttp%3A%2F%2Feulersharp.sourceforge.net%2F2003%2F03swap%2Fcountries%23%3E%0AINSERT+DATA%0A%7B%0A+country%3Aooo+foaf%3Aname+%22OOOOO%22%40en+.%0A%7D' -H 'Accept: text/plain,*/*;q=0.9'
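The percent-encoded payloads above are hard to read and to edit by hand. curl can do the encoding itself via --data-urlencode, so the same requests could be written as in the sketch below. The function names sparql_query and sparql_update_status are hypothetical, and the endpoint URLs in the usage lines are assumptions based on the shiro.ini rules.

```shell
# Readable versions of the query and update used above.
QUERY='PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT * { ?s foaf:name "Test"@en . }'

UPDATE='PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX country: <http://eulersharp.sourceforge.net/2003/03swap/countries#>
INSERT DATA { country:ooo foaf:name "OOOOO"@en . }'

# Hypothetical helper: POST a query and print the response body.
#   $1 = query endpoint URL, $2 = SPARQL query text
sparql_query() {
    curl -sS "$1" --data-urlencode "query=$2" \
        -H 'Accept: application/sparql-results+json,*/*;q=0.9'
}

# Hypothetical helper: POST an update and print only the HTTP status code.
#   $1 = update endpoint URL, $2 = SPARQL update text
sparql_update_status() {
    curl -s -o /dev/null -w '%{http_code}\n' "$1" --data-urlencode "update=$2"
}

# Usage:
#   sparql_query http://localhost:3030/test/query "$QUERY"
#   sparql_update_status http://localhost:3030/test/update "$UPDATE"
```

Printing only the status code makes the comparison easy: the update should report success from localhost and an error status from a remote host.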
Webserver proxy
Instead of calling localhost:3030 to access Fuseki, I have added a new virtual host to my domain's Apache config, so that Fuseki can be accessed via the subdomain sparql.oooooooooo.io:
<VirtualHost *:80>
ServerName sparql.oooooooooo.io
ProxyRequests Off
ProxyPass / http://your.server.IP:3030/
ProxyPassReverse / http://your.server.IP:3030/
</VirtualHost>
Note: Although it would be possible to point ProxyPass at localhost (http://127.0.0.1:3030), this would make Fuseki extremely vulnerable, as the webserver would turn every request to Fuseki into a request coming from localhost, making the shiro.ini localhostFilter useless.
Reload apache site conf
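On a Debian or Ubuntu system, the reload step might look like the sketch below. It assumes the virtual host was saved as a file under /etc/apache2/sites-available/ and that the proxy modules are not yet enabled; the function name enable_fuseki_proxy and the site name passed to it are hypothetical.

```shell
# Hypothetical helper for Debian/Ubuntu: enable the proxy modules and the
# site, validate the configuration, then reload Apache.
#   $1 = site name as known to a2ensite (depends on your config file name)
enable_fuseki_proxy() {
    a2enmod proxy proxy_http || return 1
    a2ensite "$1" || return 1
    # Refuse to reload if the configuration does not parse.
    apachectl configtest || return 1
    systemctl reload apache2
}

# Usage (as root), assuming the file is sites-available/sparql.conf:
#   enable_fuseki_proxy sparql
```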
And test:
curl http://sparql.oooooooooo.io/$/status -X POST -H 'Accept: application/sparql-results+json,*/*;q=0.9'
Logging
Apache Fuseki logging is performed via SLF4J over Apache Log4j2.
The Fuseki engine looks for the Log4j2 configuration in several locations, but I will stick with the file log4j2.properties in the directory defined by FUSEKI_BASE (/etc/fuseki).
log4j2.properties was copied to /etc/fuseki/ in a previous step and includes the default logging settings.
Backups
It is possible to send a POST request for Fuseki to create a backup of a database, with
curl http://localhost:3030/$/backup/test -X POST -H 'Accept: application/sparql-results+json,*/*;q=0.9'
Here the test dataset will be backed up and stored in /etc/fuseki/backups as a gzip-compressed N-Quads file, test_2020-05-23_14-42-41.nq.gz, which, when decompressed, shows the list of triples that make up the dataset:
gzip -d test_2020-05-23_14-42-41.nq.gz
cat test_2020-05-23_14-42-41.nq
<http://eulersharp.sourceforge.net/2003/03swap/countries#zx> <http://xmlns.com/foaf/0.1/name> "Zexix"@en .
<http://eulersharp.sourceforge.net/2003/03swap/countries#zy> <http://xmlns.com/foaf/0.1/name> "Zyz"@en .
<http://eulersharp.sourceforge.net/2003/03swap/countries#ou> <http://xmlns.com/foaf/0.1/name> "Ouoaoo"@en .
...
/$/backups-list will show the list of existing backups:
curl http://localhost:3030/$/backups-list -X POST -H 'Accept: application/sparql-results+json,*/*;q=0.9'
{ "backups" : [ "test_2020-05-23_14-42-41.nq" , "test_2020-05-23_14-42-46.nq.gz" ] }
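Backups accumulate in /etc/fuseki/backups, so a scheduled job would typically both request a new backup and remove old ones. The sketch below shows one way to do that; the function names request_backup and prune_backups are hypothetical. With the shiro.ini shown earlier, the backup endpoint appears to fall under the catch-all admin rule, so the optional third argument passes credentials to curl.

```shell
# Hypothetical helper: ask Fuseki to back up one dataset.
#   $1 = server base URL, $2 = dataset name, $3 = optional user:password
request_backup() {
    if [ -n "${3:-}" ]; then
        curl -fsS -u "$3" -X POST "$1/\$/backup/$2"
    else
        curl -fsS -X POST "$1/\$/backup/$2"
    fi
}

# Hypothetical helper: delete compressed backups older than a number of days.
#   $1 = backups directory, $2 = age threshold in days (find -mtime semantics)
prune_backups() {
    find "$1" -type f -name '*.nq.gz' -mtime +"$2" -exec rm -f {} +
}

# Usage, e.g. from a daily cron job:
#   request_backup http://localhost:3030 test
#   prune_backups /etc/fuseki/backups 14
```

Only files ending in .nq.gz are pruned, so a backup that was decompressed by hand for inspection is left alone.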