Representation of a Kudu client session. More...

#include <client.h>

Inheritance diagram for kudu::client::KuduSession:

Public Types
enum	FlushMode { AUTO_FLUSH_SYNC, AUTO_FLUSH_BACKGROUND, MANUAL_FLUSH }
	Modes of flush operations. More...

enum	ExternalConsistencyMode { CLIENT_PROPAGATED, COMMIT_WAIT }
	The possible external consistency modes on which Kudu operates. More...

Public Member Functions
Status	SetFlushMode (FlushMode m) WARN_UNUSED_RESULT

Status	SetExternalConsistencyMode (ExternalConsistencyMode m) WARN_UNUSED_RESULT

Status	SetMutationBufferSpace (size_t size_bytes) WARN_UNUSED_RESULT

void	SetTimeoutMillis (int millis)

Status	Apply (KuduWriteOperation *write_op) WARN_UNUSED_RESULT

void	ApplyAsync (KuduWriteOperation write_op, KuduStatusCallback cb)

Status	Flush () WARN_UNUSED_RESULT

void	FlushAsync (KuduStatusCallback *cb)

Status	Close () WARN_UNUSED_RESULT

bool	HasPendingOperations () const

int	CountBufferedOperations () const

int	CountPendingErrors () const

void	GetPendingErrors (std::vector< KuduError * > errors, bool overflowed)

KuduClient *	client () const

Friends
class	KuduClient

class	internal::Batcher

Detailed Description

Representation of a Kudu client session.

A KuduSession belongs to a specific KuduClient, and represents a context in which all read/write data access should take place. Within a session, multiple operations may be accumulated and batched together for better efficiency. Settings like timeouts, priorities, and trace IDs are also set per session.

A KuduSession's main purpose is for grouping together multiple data-access operations together into batches or transactions. It is important to note the distinction between these two:

A batch is a set of operations which are grouped together in order to amortize fixed costs such as RPC call overhead and round trip times. A batch DOES NOT imply any ACID-like guarantees. Within a batch, some operations may succeed while others fail, and concurrent readers may see partial results. If the client crashes mid-batch, it is possible that some of the operations will be made durable while others were lost.
In contrast, a transaction is a set of operations which are treated as an indivisible semantic unit, per the usual definitions of database transactions and isolation levels.

Note: Kudu does not currently support transactions! They are only mentioned in the above documentation to clarify that batches are not transactional and should only be used for efficiency.

KuduSession is separate from KuduTable because a given batch or transaction may span multiple tables. This is particularly important in the future when we add ACID support, but even in the context of batching, we may be able to coalesce writes to different tables hosted on the same server into the same RPC.

KuduSession is separate from KuduClient because, in a multi-threaded application, different threads may need to concurrently execute transactions. Similar to a JDBC "session", transaction boundaries will be delineated on a per-session basis – in between a "BeginTransaction" and "Commit" call on a given session, all operations will be part of the same transaction. Meanwhile another concurrent Session object can safely run non-transactional work or other transactions without interfering.

Additionally, there is a guarantee that writes from different sessions do not get batched together into the same RPCs – this means that latency-sensitive clients can run through the same KuduClient object as throughput-oriented clients, perhaps by setting the latency-sensitive session's timeouts low and priorities high. Without the separation of batches, a latency-sensitive single-row insert might get batched along with 10MB worth of inserts from the batch writer, thus delaying the response significantly.

Though we currently do not have transactional support, users will be forced to use a KuduSession to instantiate reads as well as writes. This will make it more straight-forward to add RW transactions in the future without significant modifications to the API.

Users who are familiar with the Hibernate ORM framework should find this concept of a Session familiar.

Note: This class is not thread-safe except where otherwise specified.

Member Enumeration Documentation

enum kudu::client::KuduSession::ExternalConsistencyMode

The possible external consistency modes on which Kudu operates.

Enumerator

CLIENT_PROPAGATED

The response to any write will contain a timestamp. Any further calls from the same client to other servers will update those servers with that timestamp. Following write operations from the same client will be assigned timestamps that are strictly higher, enforcing external consistency without having to wait or incur any latency penalties.

In order to maintain external consistency for writes between two different clients in this mode, the user must forward the timestamp from the first client to the second by using KuduClient::GetLatestObservedTimestamp() and KuduClient::SetLatestObservedTimestamp().

This is the default external consistency mode.

Warning: Failure to propagate timestamp information through back-channels between two different clients will negate any external consistency guarantee under this mode.

COMMIT_WAIT

The server will guarantee that write operations from the same or from other client are externally consistent, without the need to propagate timestamps across clients. This is done by making write operations wait until there is certainty that all follow up write operations (operations that start after the previous one finishes) will be assigned a timestamp that is strictly higher, enforcing external consistency.

Warning: Depending on the clock synchronization state of TabletServers this may imply considerable latency. Moreover operations in COMMIT_WAIT external consistency mode will outright fail if TabletServer clocks are either unsynchronized or synchronized but with a maximum error which surpasses a pre-configured threshold.

enum kudu::client::KuduSession::FlushMode

Modes of flush operations.

Enumerator

AUTO_FLUSH_SYNC

Every write will be sent to the server in-band with the Apply() call. No batching will occur. In this mode, the Flush() call never has any effect, since each Apply() call has already flushed the buffer. This is the default flush mode.

AUTO_FLUSH_BACKGROUND

Apply() calls will return immediately, but the writes will be sent in the background, potentially batched together with other writes from the same session. If there is not sufficient buffer space, then Apply() will block for buffer space to be available.

Because writes are applied in the background, any errors will be stored in a session-local buffer. Call CountPendingErrors() or GetPendingErrors() to retrieve them.

The Flush() call can be used to block until the buffer is empty.

Warning: This is not implemented yet, see KUDU-456

Todo:: Provide an API for the user to specify a callback to do their own error reporting.

Todo:: Specify which threads the background activity runs on (probably the messenger IO threads?).

MANUAL_FLUSH

Apply() calls will return immediately, and the writes will not be sent until the user calls Flush(). If the buffer runs past the configured space limit, then Apply() will return an error.

Member Function Documentation

Status kudu::client::KuduSession::Apply ( KuduWriteOperation * write_op )

Todo:: Add "doAs" ability here for proxy servers to be able to act on behalf of other users, assuming access rights.

Apply the write operation.

The behavior of this function depends on the current flush mode. Regardless of flush mode, however, Apply() may begin to perform processing in the background for the call (e.g. looking up the tablet, etc). Given that, an error may be queued into the PendingErrors structure prior to flushing, even in MANUAL_FLUSH mode.

In case of any error, which may occur during flushing or because the write_op is malformed, the write_op is stored in the session's error collector which may be retrieved at any time.

Note: This method is thread safe.

Parameters

[in] write_op Operation to apply. This method transfers the write_op's ownership to the KuduSession.

Returns: Operation result status.

void kudu::client::KuduSession::ApplyAsync	(	KuduWriteOperation *	write_op,
		KuduStatusCallback *	cb
	)

Apply the write operation asynchronously.

This method is similar to Apply(), except it never blocks. Even in the flush modes that return immediately, cb is triggered with the result. The callback may be called by a reactor thread, or in some cases may be called inline by the same thread which calls ApplyAsync().

Parameters

[in]	write_op	Operation to apply. This method transfers the write_op's ownership to the KuduSession.
[in]	cb	Callback to report the status of the operation. The `cb` object must remain valid until it is called.

Warning: Not yet implemented.

KuduClient* kudu::client::KuduSession::client ( ) const

Returns: Client for the session: pointer to the associated client object.

Status kudu::client::KuduSession::Close ( )

Returns: Status of the session closure. In particular, an error is returned if there are unflushed or in-flight operations.

int kudu::client::KuduSession::CountBufferedOperations ( ) const

Get number of buffered operations (not the same as 'pending').

Note that this is different than HasPendingOperations() above, which includes operations which have been sent and not yet responded to. This is only relevant in MANUAL_FLUSH mode, where the result will not decrease except for after a manual flush, after which point it will be 0. In the other flush modes, data is immediately put en-route to the destination, so this will return 0.

Note: This function is thread-safe.

Returns: The number of buffered operations. These are operations that have not yet been flushed – i.e. they are not en-route yet.

int kudu::client::KuduSession::CountPendingErrors ( ) const

Get error count for pending operations.

Errors may accumulate in session's lifetime; use this method to see how many errors happened since last call of GetPendingErrors() method.

Note: This function is thread-safe.

Returns: Total count of errors accumulated during the session.

Status kudu::client::KuduSession::Flush ( )

Flush any pending writes.

In AUTO_FLUSH_SYNC mode, this has no effect, since every Apply() call flushes itself inline.

Note: This function is thread-safe.

Returns: Operation result status. In particular, returns a non-OK status if there are any pending errors after the rows have been flushed. Callers should then use GetPendingErrors to determine which specific operations failed.

void kudu::client::KuduSession::FlushAsync ( KuduStatusCallback * cb )

Flush any pending writes asynchronously.

This method schedules a background flush of pending operations. Provided callback is invoked upon completion of the flush. If there were errors while flushing the operations, corresponding 'not OK' status is passed as a parameter for the callback invocation. Callers should then use GetPendingErrors() to determine which specific operations failed.

Parameters

[in] cb Callback to call upon flush completion. The cb must remain valid until it is invoked.

In the case that the async version of this method is used, then the callback will be called upon completion of the operations which were buffered since the last flush. In other words, in the following sequence:

session->Insert(a);
session->FlushAsync(callback_1);
session->Insert(b);
session->FlushAsync(callback_2);

... callback_2 will be triggered once b has been inserted, regardless of whether a has completed or not.

Note: This also means that, if FlushAsync is called twice in succession, with no intervening operations, the second flush will return immediately. For example:
session->Insert(a);

session->FlushAsync(callback_1); // called when 'a' is inserted

session->FlushAsync(callback_2); // called immediately!

Note that, as in all other async functions in Kudu, the callback may be called either from an IO thread or the same thread which calls FlushAsync. The callback should not block.

void kudu::client::KuduSession::GetPendingErrors	(	std::vector< KuduError * > *	errors,
		bool *	overflowed
	)

Get information on errors from previous session activity.

The information on errors are reset upon calling this method.

Parameters

[out]	errors	Pointer to the container to fill with error info objects. Caller takes ownership of the returned errors in the container.
[out]	overflowed	If there were more errors than could be held in the session's error storage, then `overflowed` is set to `true`.

Note: This function is thread-safe.

bool kudu::client::KuduSession::HasPendingOperations ( ) const

Check if there are any pending operations in this session.

Note: This function is thread-safe.

Returns: true if there are operations which have not yet been delivered to the cluster. This may include buffered operations (i.e. those that have not yet been flushed) as well as in-flight operations (i.e. those that are in the process of being sent to the servers).

Todo:: Maybe "incomplete" or "undelivered" is clearer?

Status kudu::client::KuduSession::SetExternalConsistencyMode ( ExternalConsistencyMode m )

Set external consistency mode for the session.

Parameters

[in] m External consistency mode to set.

Returns: Operation result status.

Status kudu::client::KuduSession::SetFlushMode ( FlushMode m )

Set the flush mode.

Precondition: There should be no pending writes – call Flush() first to ensure nothing is pending.

Parameters

[in] m Flush mode to set.

Returns: Operation status.

Status kudu::client::KuduSession::SetMutationBufferSpace ( size_t size_bytes )

Set the amount of buffer space used by this session for outbound writes.

The effect of the buffer size varies based on the flush mode of the session:

AUTO_FLUSH_SYNC since no buffering is done, this has no effect.
AUTO_FLUSH_BACKGROUND if the buffer space is exhausted, then write calls will block until there is space available in the buffer.
MANUAL_FLUSH if the buffer space is exhausted, then write calls will return an error

Parameters

[in] size_bytes Size of the buffer space to set (number of bytes).

Returns: Operation result status.

void kudu::client::KuduSession::SetTimeoutMillis ( int millis )

Set the timeout for writes made in this session.

Parameters

[in] millis Timeout to set in milliseconds; should be greater than 0.

The documentation for this class was generated from the following file:

include/kudu/client/client.h

Public Types

Public Member Functions

Friends

Detailed Description

Member Enumeration Documentation

Member Function Documentation