Working with Objects
Copied from Hibernate Reference Manual - please re-write.
Topaz is a full object/relational mapping solution that not only shields the developer from the details of the underlying triple-stores and blob-stores but also offers state management of objects. This is, contrary to the management of TQL/SPARQL statements in common RDF persistence layers, a very natural object-oriented view of persistence in Java applications.
In other words, Topaz application developers should always think about the state of their objects, and not necessarily about the execution of TQL/SPARQL statements. This part is taken care of by Topaz and is only relevant for the application developer when tuning the performance of the system.
Topaz object states
Topaz defines and supports the following object states:
- Transient - an object is transient if it has just been instantiated using the new operator, and it is not associated with a Topaz Session. It has no persistent representation in the triple-store and no identifier value has been assigned. Transient instances will be destroyed by the garbage collector if the application doesn't hold a reference anymore. Use the Topaz Session to make an object persistent (and let Topaz take care of the TQL/SPARQL statements that need to be executed for this transition).
- Persistent - a persistent instance has a representation in the triple-store and an identifier value. It might just have been saved or loaded; either way, it is by definition in the scope of a Session. Topaz will detect any changes made to an object in persistent state and synchronize the state with the triple-store when the unit of work completes. Developers don't execute manual 'insert' statements, or 'delete' statements when an object should be made transient.
- Detached - a detached instance is an object that has been persistent, but its Session has been closed. The reference to the object is still valid, of course, and the detached instance might even be modified in this state. A detached instance can be reattached to a new Session at a later point in time, making it (and all the modifications) persistent again. This feature enables a programming model for long running units of work that require user think-time. We call them application transactions, i.e. a unit of work from the point of view of the user.
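The three states can be pictured with a small toy model. This is only a sketch: ObjectStateSketch, ToySession and the "cats:" identifier scheme are hypothetical names invented for illustration and are not part of the Topaz API.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Toy model of the transient / persistent / detached states (not the Topaz API).
class ObjectStateSketch {
  enum State { TRANSIENT, PERSISTENT, DETACHED }

  static class Cat {
    String id;   // null until a session assigns an identifier
    String name;
  }

  // Hypothetical session that only tracks which instances it manages.
  static class ToySession {
    private final Set<Cat> managed =
        Collections.newSetFromMap(new IdentityHashMap<Cat, Boolean>());
    private int nextId = 1;

    String saveOrUpdate(Cat cat) {
      if (cat.id == null) cat.id = "cats:" + nextId++; // generated identifier
      managed.add(cat);                                // now in session scope
      return cat.id;
    }

    void close() { managed.clear(); }                  // instances become detached

    State stateOf(Cat cat) {
      if (managed.contains(cat)) return State.PERSISTENT;
      return cat.id == null ? State.TRANSIENT : State.DETACHED;
    }
  }

  public static void main(String[] args) {
    ToySession session = new ToySession();
    Cat fritz = new Cat();
    System.out.println(session.stateOf(fritz)); // before saveOrUpdate
    session.saveOrUpdate(fritz);
    System.out.println(session.stateOf(fritz)); // while the session is open
    session.close();
    System.out.println(session.stateOf(fritz)); // after the session is closed
  }
}
```

Passing the detached instance to saveOrUpdate() on a second ToySession makes it persistent again while keeping its identifier, which mirrors the reattachment described above.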
We'll now discuss the states and state transitions (and the Topaz methods that trigger a transition) in more detail.
Making objects persistent
Newly instantiated instances of a persistent class are considered transient by Topaz. We can make a transient instance persistent by associating it with a session:
DomesticCat fritz = new DomesticCat();
fritz.setColor(Color.GINGER);
fritz.setSex('M');
fritz.setName("Fritz");
String generatedId = sess.saveOrUpdate(fritz);
If Cat has a generated identifier, the identifier is generated and assigned to the cat when saveOrUpdate() is called. If Cat has an assigned identifier, the identifier should be assigned to the cat instance before calling saveOrUpdate().
saveOrUpdate() is guaranteed to return an identifier. This works as long as the identifier generator does not have to store the next sequence value in the triple-store, which would require a commit. The default generator in Topaz does not write anything to the triple-store.
If the object you make persistent has associated objects (e.g. the kittens collection in the previous example), these objects may be made persistent in any order you like.
Usually you don't bother with this detail, as you'll very likely use Topaz's transitive persistence feature to save the associated objects automatically. Transitive persistence is discussed later in this chapter.
Loading an object
The get() method of Session gives you a way to retrieve a persistent instance if you already know its identifier. get() takes a class object and loads the state into a newly instantiated instance of that class, in persistent state.
Cat fritz = (Cat) sess.get(Cat.class, generatedId);
// Another option is load()
Cat fritz = (Cat) sess.load(Cat.class, generatedId);
load() returns an uninitialized proxy and does not actually hit the triple-store until you invoke a method of the proxy. This behavior is very useful if you wish to create an association to an object without actually loading it from the triple-store.
If you are not certain that a matching instance exists, you should use the get() method, which hits the triple-store immediately and returns null if there is no matching instance. (Note: This is only true for entities with a configured rdf:type. If there is no rdf:type value, an entity instance is always created.)
Cat cat = (Cat) sess.get(Cat.class, id);
if (cat == null) {
  cat = new Cat();
  sess.saveOrUpdate(cat, id);
}
return cat;
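The difference between load() and get() can be modeled with a counter on a stand-in store. The sketch below is an illustration under assumptions: ToyStore, its hits counter and the hand-written lazy Cat are hypothetical and say nothing about how Topaz builds its proxies.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of lazy load() versus eager get() (not the Topaz API).
class LazyLoadSketch {
  // Hypothetical in-memory stand-in for the triple-store; counts every read.
  static class ToyStore {
    final Map<String, String> namesById = new HashMap<>();
    int hits = 0;

    String readName(String id) {
      hits++;
      return namesById.get(id);
    }
  }

  static class Cat {
    private final ToyStore store;
    private final String id;
    private String name;
    private boolean initialized;

    Cat(ToyStore store, String id) { this.store = store; this.id = id; }

    String getId() { return id; }   // known up front, no store access

    String getName() {              // the first call triggers the read
      if (!initialized) {
        name = store.readName(id);
        initialized = true;
      }
      return name;
    }
  }

  // load(): hand back an uninitialized stand-in without touching the store.
  static Cat load(ToyStore store, String id) {
    return new Cat(store, id);
  }

  // get(): read immediately and return null when nothing matches.
  static Cat get(ToyStore store, String id) {
    Cat cat = new Cat(store, id);
    return cat.getName() == null ? null : cat;
  }

  public static void main(String[] args) {
    ToyStore store = new ToyStore();
    store.namesById.put("cats:1", "Fritz");
    Cat lazy = load(store, "cats:1");
    System.out.println("hits after load(): " + store.hits);
    System.out.println(lazy.getName() + ", hits after getName(): " + store.hits);
    System.out.println("get() on a missing id: " + get(store, "cats:404"));
  }
}
```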
It is possible to re-load an object and all its collections at any time, using the refresh() method. This is useful when inverse mapped property updates are used to initialize some of the properties of the object.
cat.setMother(mother);
sess.saveOrUpdate(cat);
sess.flush(); // force the TQL INSERT
sess.refresh(mother); // re-read the state so that the kittens list is updated
An important question usually appears at this point: How much does Topaz load from the database and how many TQL SELECTs will it use?
Currently each object loaded by Topaz requires two queries:
- one for the forward mapped properties
- and the other for the reverse mapped properties
If you don't know the identifiers of the objects you are looking for, you need a query. Topaz supports an easy-to-use but powerful object oriented query language (OQL). You may also express your query in the native TQL of your database.
Executing queries
OQL and native TQL queries are represented with an instance of org.topazproject.otm.Query. This interface offers methods for parameter binding, result set handling, and for the execution of the actual query. You always obtain a Query using the current Session:
Results results = session.createQuery(
    "select cat from Cat cat where cat.birthdate < :bd;")
    .setParameter("bd", date)
    .execute();
List<Cat> cats = new ArrayList<Cat>();
while (results.next())
  cats.add((Cat) results.get(0));

Results results = session.createQuery(
    "select cat.mother from Cat cat where cat.name = :name")
    .setParameter("name", name)
    .execute();
List<Cat> cats = new ArrayList<Cat>();
while (results.next())
  cats.add((Cat) results.get(0));

Results results = session.createQuery(
    "select cat, cat.mother from Cat cat where cat.name = :name")
    .setParameter("name", name)
    .execute();
while (results.next()) {
  Cat cat = (Cat) results.get(0);
  Cat mother = (Cat) results.get(1);
  ...
  ...
}
A query is executed using the execute() method. The actual entity instances returned by the query may already be in the session or second-level cache. If they are not already cached, iterating will be slower and might require many database hits for a simple query, usually 1 for the initial select which only returns identifiers, and n additional selects to initialize the actual instances.
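The "1 plus n" cost mentioned above can be made concrete with a toy counter. This is a sketch under assumptions: ToyStore, ToySession and the selects counter are hypothetical, and a plain map stands in for the session cache.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of one identifier select plus n instance selects (not the Topaz API).
class QueryCostSketch {
  static class ToyStore {
    final Map<String, String> namesById = new LinkedHashMap<>();
    int selects = 0;

    // The initial select returns identifiers only.
    List<String> selectIds() {
      selects++;
      return new ArrayList<>(namesById.keySet());
    }

    // One additional select per instance that must be initialized.
    String selectState(String id) {
      selects++;
      return namesById.get(id);
    }
  }

  static class ToySession {
    private final Map<String, String> cache = new LinkedHashMap<>();

    List<String> execute(ToyStore store) {
      List<String> names = new ArrayList<>();
      for (String id : store.selectIds()) {
        String name = cache.get(id);
        if (name == null) {          // not cached: hit the store again
          name = store.selectState(id);
          cache.put(id, name);
        }
        names.add(name);
      }
      return names;
    }
  }

  public static void main(String[] args) {
    ToyStore store = new ToyStore();
    store.namesById.put("cats:1", "Fritz");
    store.namesById.put("cats:2", "Izi");
    store.namesById.put("cats:3", "PK");
    ToySession session = new ToySession();
    session.execute(store);
    System.out.println("selects after first run: " + store.selects);
    session.execute(store);
    System.out.println("selects after second run: " + store.selects);
  }
}
```

With three cats, the first run costs four selects in this model and the second run only one more, because every instance is already cached.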
Bind parameters
Methods on Query are provided for binding values to named parameters. Named parameters are identifiers of the form :name in the query string.
Parameters give you three things:
- It can make for a more readable query string
- It can make for a reusable query string
- It does escaping and quoting for you (e.g. if you have a literal, it will escape the quotes before inserting it into the query). If you use the setParameter(String, Object) method, it also works out whether the value should be a URI or a literal and serializes the value for you (e.g. a Date object is turned into a properly formatted date string).
We therefore recommend using parameters, as they provide more checking (e.g. you'll get warnings if you use setUri() where a literal is expected) and automatic conversion from Java objects to their appropriate lexical representation.
// setParameter is preferred
Query q = sess.createQuery("select cat from DomesticCat cat where cat.name = :name;");
q.setParameter("name", "Fritz");
Results cats = q.execute();
// setPlainLiteral for an untyped literal. The value must already be serialized
Query q = sess.createQuery("select cat from DomesticCat cat where cat.name = :name;");
q.setPlainLiteral("name", "Izi", null);
// setTypedLiteral for a typed literal. The value must already be serialized
Query q = sess.createQuery("select cat from DomesticCat cat where cat.birthDate > :date;");
SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
q.setTypedLiteral("date", format.format(date), null);
// setUri for resource nodes
Query q = sess.createQuery("select cat from DomesticCat cat where cat.mother = :mother;");
q.setUri("mother", URI.create("http://cats.com/izi"));
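To illustrate what escaping, quoting and serialization amount to, here is a minimal stand-alone binder. It is a sketch under stated assumptions, not Topaz's implementation: ParamBindingSketch, bind() and serialize() are hypothetical names, single-quoted literals with backslash escaping are assumed, and java.time.LocalDate stands in for a date value.

```java
import java.net.URI;
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy named-parameter binder (not the Topaz API).
class ParamBindingSketch {
  // A named parameter is a colon followed by an identifier, e.g. :name
  private static final Pattern PARAM = Pattern.compile(":([A-Za-z_][A-Za-z0-9_]*)");

  // URIs become <uri>; everything else becomes an escaped, quoted literal.
  static String serialize(Object value) {
    if (value instanceof URI) return "<" + value + ">";
    String lexical = value.toString(); // LocalDate prints as yyyy-MM-dd
    return "'" + lexical.replace("\\", "\\\\").replace("'", "\\'") + "'";
  }

  static String bind(String query, Map<String, Object> params) {
    Matcher m = PARAM.matcher(query);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      Object value = params.get(m.group(1));
      if (value == null)
        throw new IllegalArgumentException("unbound parameter: " + m.group(1));
      // quoteReplacement keeps backslashes in the value from being misread
      m.appendReplacement(out, Matcher.quoteReplacement(serialize(value)));
    }
    m.appendTail(out);
    return out.toString();
  }

  public static void main(String[] args) {
    Map<String, Object> params = new HashMap<>();
    params.put("name", "O'Malley");
    params.put("date", LocalDate.of(2008, 1, 15));
    params.put("mother", URI.create("http://cats.com/izi"));
    System.out.println(bind("select cat from Cat cat where cat.name = :name;", params));
    System.out.println(bind("select cat from Cat cat where cat.birthDate > :date;", params));
    System.out.println(bind("select cat from Cat cat where cat.mother = :mother;", params));
  }
}
```

The same query string can be reused with different parameter maps, which is the reusability benefit listed above.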
Modifying persistent objects
Transactional persistent instances (i.e. objects loaded, saved, created or queried by the Session) may be manipulated by the application, and any changes to persistent state will be persisted when the Session is flushed (discussed later in this chapter). There is no need to call a particular method (like saveOrUpdate(), which has a different purpose) to make your modifications persistent. So the most straightforward way to update the state of an object is to get() it, and then manipulate it directly, while the Session is open:
DomesticCat cat = (DomesticCat) sess.get(Cat.class, "cats:23");
cat.setName("PK");
sess.flush(); // changes to cat are automatically detected and persisted
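One common way to detect changes automatically is to keep a snapshot of the loaded state and compare it at flush time. The sketch below shows that technique only; it is an assumption that snapshots are a reasonable mental model here, DirtyCheckSketch and ToySession are hypothetical, and the generated strings are placeholders rather than real TQL.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy snapshot-based dirty checking (not the Topaz API).
class DirtyCheckSketch {
  static class Cat {
    final String id;
    String name;
    Cat(String id, String name) { this.id = id; this.name = name; }
  }

  static class ToySession {
    // State of each managed instance as it was last read or written.
    private final Map<Cat, String> snapshots = new IdentityHashMap<>();
    // Placeholder statements that a flush would send to the store.
    final List<String> statements = new ArrayList<>();

    Cat get(String id, String storedName) {
      Cat cat = new Cat(id, storedName);
      snapshots.put(cat, storedName);
      return cat;
    }

    void flush() {
      for (Map.Entry<Cat, String> entry : snapshots.entrySet()) {
        Cat cat = entry.getKey();
        String before = entry.getValue();
        if (!Objects.equals(before, cat.name)) {
          // An update is a delete of the old value plus an insert of the new one.
          statements.add("delete " + cat.id + " name " + before);
          statements.add("insert " + cat.id + " name " + cat.name);
          entry.setValue(cat.name); // the snapshot now matches the store
        }
      }
    }
  }

  public static void main(String[] args) {
    ToySession sess = new ToySession();
    Cat cat = sess.get("cats:23", "Tom");
    cat.name = "PK";   // no explicit update call
    sess.flush();
    System.out.println(sess.statements);
  }
}
```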
Sometimes this programming model is inefficient since it would require both a TQL SELECT (to load an object) and a TQL DELETE and TQL INSERT (to persist its updated state) in the same session. Therefore Topaz offers an alternate approach, using detached instances.
Note that Topaz does not offer its own API for direct execution of INSERT or DELETE statements. Topaz is a state management service; you don't have to think in statements to use it. Furthermore, the notion of mass operations conflicts with object/triple mapping for online transaction processing-oriented applications. Future versions of Topaz may however provide special mass operation functions.
Modifying detached objects
Many applications need to retrieve an object in one transaction, send it to the UI layer for manipulation, then save the changes in a new transaction. Applications that use this kind of approach in a high-concurrency environment usually use versioned data to ensure isolation for the "long" unit of work.
Topaz supports this model by providing for reattachment of detached instances using the Session.saveOrUpdate() or Session.merge() methods:
// in the first session
Cat cat = (Cat) firstSession.load(Cat.class, catId);
Cat potentialMate = new Cat();
firstSession.saveOrUpdate(potentialMate);
// in a higher layer of the application
cat.setMate(potentialMate);
// later, in a new session
secondSession.saveOrUpdate(cat); // update cat
secondSession.saveOrUpdate(potentialMate); // update mate
If the Cat with identifier catId had already been loaded by secondSession when the application tried to reattach it, an exception would have been thrown.
Use saveOrUpdate() if you are sure that the session does not contain an already persistent instance with the same identifier, and merge() if you want to merge your modifications at any time without consideration of the state of the session.
In other words, saveOrUpdate() is usually the first method you would call in a fresh session, ensuring that reattachment of your detached instances is the first operation that is executed.
The application should individually saveOrUpdate() detached instances reachable from the given detached instance if and only if it wants their state also updated. This can be automated, of course, using transitive persistence; see “Transitive persistence”.
The usage and semantics of saveOrUpdate() seem to be confusing for new users. Firstly, so long as you are not trying to use instances from one session in another new session, you should not need to use saveOrUpdate() or merge(). Some whole applications will never use either of these methods.
Usually saveOrUpdate() is used in the following scenario:
- the application loads an object in the first session
- the object is passed up to the UI tier
- some modifications are made to the object
- the object is passed back down to the business logic tier
- the application persists these modifications by calling saveOrUpdate() in a second session
saveOrUpdate() does the following:
- if the object is already persistent in this session, do nothing
- if another object associated with the session has the same identifier, throw an exception
- if the object has no identifier property, save it
- if the object's identifier has the value assigned to a newly instantiated object, save it
- otherwise update the object
and merge() is very different:
- if there is a persistent instance with the same identifier currently associated with the session, copy the state of the given object onto the persistent instance
- if there is no persistent instance currently associated with the session, try to load it from the database, or create a new persistent instance
- the persistent instance is returned
- the given instance does not become associated with the session, it remains detached
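The contrast between the two rule lists can be sketched with a toy session keyed by identifier. This is an illustration only: ReattachSketch and ToySession are hypothetical, IllegalStateException is an arbitrary choice of exception, and creating an empty instance stands in for "load it from the database, or create a new persistent instance".

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of saveOrUpdate() versus merge() (not the Topaz API).
class ReattachSketch {
  static class Cat {
    final String id;
    String name;
    Cat(String id, String name) { this.id = id; this.name = name; }
  }

  static class ToySession {
    private final Map<String, Cat> byId = new HashMap<>();

    // Associates the given instance itself with the session.
    void saveOrUpdate(Cat cat) {
      Cat existing = byId.get(cat.id);
      if (existing == cat) return; // already persistent here: do nothing
      if (existing != null)
        throw new IllegalStateException("another instance with id " + cat.id
            + " is already associated with this session");
      byId.put(cat.id, cat);
    }

    // Copies state onto the session's own instance; the argument stays detached.
    Cat merge(Cat detached) {
      Cat persistent = byId.get(detached.id);
      if (persistent == null) {
        persistent = new Cat(detached.id, null); // stands in for load-or-create
        byId.put(persistent.id, persistent);
      }
      persistent.name = detached.name;
      return persistent;
    }

    boolean contains(Cat cat) { return byId.get(cat.id) == cat; }
  }

  public static void main(String[] args) {
    ToySession session = new ToySession();
    Cat loaded = new Cat("cats:1", "Tom");
    session.saveOrUpdate(loaded);
    Cat detached = new Cat("cats:1", "PK");
    Cat merged = session.merge(detached);
    System.out.println(merged == loaded);           // state copied onto loaded
    System.out.println(session.contains(detached)); // detached stays detached
  }
}
```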
Deleting persistent objects
Session.delete() will remove an object's state from the database. Of course, your application might still hold a reference to a deleted object. It's best to think of delete() as making a persistent instance transient.
sess.delete(cat);
You may delete objects in any order you like. It is still possible to void a delete by deleting objects in the wrong order, e.g. if you delete the parent, but forget to delete the children.
Flushing the Session
From time to time the Session will execute the TQL statements needed to synchronize the triple-store's state with the state of objects held in memory. This process, flush, occurs by default at the following points:
- before some query executions
- from org.topazproject.otm.Transaction.commit()
- from Session.flush()
Except when you explicitly flush(), there are absolutely no guarantees about when the Session writes out the changes. However, Topaz does guarantee that Query.execute(..) will never return stale data, nor will it return the wrong data.
It is possible to change the default behavior so that flush occurs less frequently. The FlushMode class defines three different modes: only flush at commit time (and only when the Topaz Transaction API is used), flush automatically using the explained routine, or never flush unless flush() is called explicitly. The last mode is useful for long running units of work, where a Session is kept open and disconnected for a long time.
sess = sf.openSession();
Transaction tx = sess.beginTransaction();
sess.setFlushMode(FlushMode.COMMIT); // allow queries to return stale state
Cat izi = (Cat) sess.load(Cat.class, id);
izi.setName("izi");
// might return stale data
sess.createQuery("select cat from Cat cat where cat.name = 'iznizi';").execute();
// change to izi is not flushed!
...
tx.commit(); // flush occurs
sess.close();
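The effect of the three modes on query results can be reproduced with a toy session that buffers changes. This is a sketch under assumptions: the Mode names AUTO and MANUAL are hypothetical (only FlushMode.COMMIT appears in the text above), and the two maps stand in for the triple-store and the pending in-memory changes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of flush modes (not the Topaz API).
class FlushModeSketch {
  enum Mode { AUTO, COMMIT, MANUAL } // hypothetical names for the three modes

  static class ToySession {
    Mode mode = Mode.AUTO;
    private final Map<String, String> store = new HashMap<>();   // id -> name
    private final Map<String, String> pending = new HashMap<>(); // unflushed

    void setName(String id, String name) { pending.put(id, name); }

    void flush() {
      store.putAll(pending);
      pending.clear();
    }

    // Queries only see what has been flushed to the store.
    List<String> findIdsByName(String name) {
      if (mode == Mode.AUTO) flush(); // flush before query execution
      List<String> ids = new ArrayList<>();
      for (Map.Entry<String, String> e : store.entrySet())
        if (e.getValue().equals(name)) ids.add(e.getKey());
      return ids;
    }

    void commit() {
      if (mode != Mode.MANUAL) flush(); // MANUAL needs an explicit flush()
    }
  }

  public static void main(String[] args) {
    ToySession sess = new ToySession();
    sess.mode = Mode.COMMIT;
    sess.setName("cats:1", "izi");
    System.out.println(sess.findIdsByName("izi")); // stale: change not flushed
    sess.commit();
    System.out.println(sess.findIdsByName("izi")); // visible after commit
  }
}
```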
During flush, an exception might occur. Since handling exceptions involves some understanding of Topaz's transactional behavior, we discuss it in Transactions And Concurrency.
Transitive persistence
It is quite cumbersome to save, delete, or reattach individual objects, especially if you deal with a graph of associated objects. A common case is a parent/child relationship. Consider the following example:
If the children in a parent/child relationship were value typed (e.g. a collection of strings), their life cycle would depend on the parent and no further action would be required for convenient "cascading" of state changes. When the parent is saved, the value-typed child objects are saved as well; when the parent is deleted, the children are deleted, and so on. This even works for operations such as the removal of a child from the collection; Topaz will detect this and, since value-typed objects can't have shared references, delete the child from the database.
Now consider the same scenario with parent and child objects being entities, not value-types (e.g. categories and items, or parent and child cats). Entities have their own life cycle and support shared references, so removing an entity from the collection does not mean it can be deleted. By default, Topaz treats these entities as peers and implements persistence by reachability (unlike Hibernate). Peer cascading covers all Session operations except those that would delete the peer.
For each basic operation of the Topaz session - including merge(), saveOrUpdate(), delete(), refresh(), evict() - there is a corresponding cascade style. Respectively, the cascade styles are named merge, saveOrUpdate, delete, refresh, evict.
| CascadeType.merge | Cascades session.merge() to the association |
| CascadeType.saveOrUpdate | Cascades session.saveOrUpdate() to the association. Also flush() will transitively cascade to associations marked with this CascadeType. |
| CascadeType.refresh | Cascades session.refresh() to the association |
| CascadeType.evict | Cascades session.evict() to the association |
| CascadeType.peer | This is the default. It implies the merge, saveOrUpdate, refresh and evict CascadeTypes. See Peer below. |
| CascadeType.delete | Cascades session.delete() to the association. This includes cascading of deletes during flush() when the parent is orphaned and deleted. |
| CascadeType.deleteOrphan | During a flush(), deletes the previously associated entity instance when it is no longer associated with a parent entity instance. See Delete Orphan below. |
| CascadeType.child | Implies the peer, delete and deleteOrphan CascadeTypes. See Child below. |
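The "implies" relationships in the table can be written out as a small expansion function. This is a sketch only: the local Cascade enum copies the names from the table for illustration, and expand() is a hypothetical helper, not something the Topaz API is known to provide.

```java
import java.util.EnumSet;
import java.util.Set;

// Toy expansion of composite cascade types into basic ones (not the Topaz API).
class CascadeSketch {
  enum Cascade { merge, saveOrUpdate, refresh, evict, delete, deleteOrphan, peer, child }

  static Set<Cascade> expand(Cascade... declared) {
    EnumSet<Cascade> result = EnumSet.noneOf(Cascade.class);
    if (declared.length == 0) result.add(Cascade.peer); // peer is the default
    for (Cascade c : declared) result.add(c);

    // child implies peer, delete and deleteOrphan
    if (result.remove(Cascade.child)) {
      result.add(Cascade.peer);
      result.add(Cascade.delete);
      result.add(Cascade.deleteOrphan);
    }
    // peer implies merge, saveOrUpdate, refresh and evict
    if (result.remove(Cascade.peer)) {
      result.addAll(EnumSet.of(Cascade.merge, Cascade.saveOrUpdate,
                               Cascade.refresh, Cascade.evict));
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println("default: " + expand());
    System.out.println("child:   " + expand(Cascade.child));
  }
}
```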
If you want an operation to be cascaded along an association, you must indicate that explicitly. For POJO mapping, this can be achieved by configuring the @Predicate annotation:
@Predicate(cascade={CascadeType.saveOrUpdate})
...
Cascade styles may be combined:
@Predicate(cascade={CascadeType.saveOrUpdate, CascadeType.merge, CascadeType.refresh, CascadeType.evict})
...
You may even use cascade="child" to specify that all operations should be cascaded along the association. The default cascade="peer" cascades every operation that does not violate a shared reference held by other entity instances.
Delete Orphan
A special cascade style, "delete-orphan", applies only to one-to-many associations, and indicates that the delete() operation should be applied to any child object that is removed from the association.
For example, removing discontinued products from a product catalogue is sufficient to remove the products from the database if the products list is marked as "delete-orphan".
ProductCatalogue catalogue = session.get(ProductCatalogue.class, id);
...
...
catalogue.getProducts().removeAll(discontinuedProducts);
...
...
session.getTransaction().commit();
Note that this can work for scalar fields also. Setting the scalar field to null or replacing the scalar field value with a different value is sufficient to delete the old associated object.
Person person = session.get(Person.class, ...);
...
...
Address oldAddress = person.getAddress();
person.setAddress(newAddress);
...
...
session.getTransaction().commit();
In the above, the oldAddress will be deleted during the flush() call prior to commit if the delete-orphan cascade option is set for the address property.
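Conceptually, orphan detection is a set difference between what an association held when it was loaded and what it holds at flush time. The helper below sketches that idea; OrphanSketch and orphans() are hypothetical names, and comparing by equals() is an assumption made for brevity.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy orphan detection by set difference (not the Topaz API).
class OrphanSketch {
  // Everything referenced at load time but no longer referenced at flush time.
  static <T> Set<T> orphans(Collection<T> atLoad, Collection<T> atFlush) {
    Set<T> removed = new LinkedHashSet<>(atLoad);
    removed.removeAll(atFlush);
    return removed;
  }

  public static void main(String[] args) {
    // Collection-valued association: two products were discontinued.
    List<String> before = Arrays.asList("lamp", "radio", "vcr");
    List<String> after = Arrays.asList("lamp");
    System.out.println(orphans(before, after));

    // Scalar association: a single-element collection models the field.
    System.out.println(orphans(Collections.singletonList("oldAddress"),
                               Collections.singletonList("newAddress")));
  }
}
```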
Child
Mapping an association (either a single valued association, or a collection) with cascade="child" marks the association as a parent/child style relationship where save/update/delete of the parent results in save/update/delete of the child or children.
Furthermore, a mere reference to a child from a persistent parent will result in save/update of the child. The precise semantics of cascading operations for a parent/child relationship are as follows:
- If a parent is passed to saveOrUpdate(), all children are passed to saveOrUpdate()
- If a parent is passed to merge(), all children are passed to merge()
- If a transient or detached child becomes referenced by a persistent parent, it is passed to saveOrUpdate()
- If a parent is deleted, all children are passed to delete()
- If a child is dereferenced by a persistent parent, the "orphaned" child is deleted.
- If a parent is passed to evict(), all children are passed to evict()
Peer
The precise semantics of cascading operations for a peer relationship are as follows:
- If an object is passed to saveOrUpdate(), all peer objects are passed to saveOrUpdate()
- If an object is passed to merge(), all peer objects are passed to merge()
- If a transient or detached object becomes referenced as a peer by a persistent object, it is passed to saveOrUpdate()
- If an object is deleted, none of the peer objects are passed to delete() - the application should explicitly delete the peer if necessary.
- If a peer is dereferenced by a persistent object, nothing special happens - the application should explicitly delete the peer if necessary - unless cascade="delete-orphan", in which case the "orphaned" peer is deleted.
- If an object is passed to evict(), all peer objects are passed to evict() too. However if the peer is referenced from other objects (shared reference), transitive 'saveOrUpdate'/'delete' of those other referencing objects during flush() will cause the peer to be re-attached and saved/deleted.
Recommendations
- If the child object's lifespan is bounded by the lifespan of the parent object, make it a life cycle object by specifying cascade="child".
- Otherwise, you might not need cascade at all. But if you think that you will often be working with the associated objects together in the same transaction, and you want to save yourself some typing, consider using cascade="peer".
Important Note
Finally, note that cascading of operations can be applied to an object graph at call time or at flush time. All operations, if enabled, are cascaded to associated entities reachable when the operation is executed. However, saveOrUpdate and delete-orphan are transitive for all associated entities reachable during flush of the Session.
Using metadata
Topaz requires a very rich meta-level model of all entity and value types. From time to time, this model is very useful to the application itself. For example, the application might use Topaz's metadata to implement a "smart" deep-copy algorithm that understands which objects should be copied (e.g. mutable value types) and which should not (e.g. immutable value types and, possibly, associated entities).
Topaz exposes metadata via the ClassMetadata interface. Instances of the metadata interfaces may be obtained from the SessionFactory.
Cat fritz = ......;
ClassMetadata cm = sessionfactory.getClassMetadata(Cat.class);
for (RdfMapper m : cm.getRdfMappers()) {
  List<String> nv;
  PropertyBinder b = m.getBinder(sess);
  RawFieldData data = b.getRawFieldData(fritz);
  if (data != null) {
    // Data is yet to be loaded on the field, so get it from here
    nv = data.getValues();
  } else if (!m.isAssociation()) {
    // Data is loaded on the field. Get it from the field
    nv = b.get(fritz);
  } else {
    // Associations that are loaded on the field:
    // we'll just get the ids of those objects
    nv = sess.getIds(b.get(fritz));
  }
  // do something with the serialized values
  ....
  // for raw values b.getRawValue(fritz) is the API to use
}
BlobMapper blobField = cm.getBlobField();
if (blobField != null) {
  PropertyBinder binder = blobField.getBinder(sess);
  Streamer streamer = binder.getStreamer();
  if (!streamer.isManaged())
    blob = copy(streamer.getBytes(binder, fritz));
}
