While certain embodiments of the inventions have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the spirit of the inventions.
A distributed system configured to resolve an atomic transaction comprising multiple transactions among a set of parties within the distributed system, the distributed system comprising: a plurality of participants, each of the plurality of participants residing on a node of a computer system and in communication with each of the other of the plurality of participants; and [...]
The distributed system of claim 1, wherein the coordinator is further configured to decide to: commit the atomic transaction if prepared messages have been received from each of the plurality of participants and each received prepared message indicates that the sending participant can commit a transaction of the atomic transaction; and [...]
The distributed system of claim 1, wherein each of the plurality of participants is further configured to decide to: commit the transaction if a decision to commit the atomic transaction is received from the coordinator; and [...]
The distributed system of claim 2, wherein the coordinator is further configured to abort the transaction if communication between the coordinator and one of the plurality of participants is unavailable.
The distributed system of claim 1, wherein each of the plurality of participants is further configured to commit the transaction if all of the received one or more decision messages indicate that the sending participant has committed a transaction of the atomic transaction.
The distributed system of claim 1, wherein each of the plurality of participants is further configured to abort the transaction if one of the received one or more decision messages indicates that the sending participant has aborted a transaction of the atomic transaction.
The distributed system of claim 1, further comprising a shared participant residing on the same node of the computer system as the coordinator and in communication with each of the plurality of participants, the shared participant configured to: receive a prepare message from an initiator, the prepare message indicating that the shared participant should commit a transaction of the atomic transaction; [...]
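As a rough illustration of the decision rules recited in the claims above, the following Python sketch models the coordinator's choice and a participant's choice based on decision messages from other participants. The names `Decision`, `coordinator_decide`, and `participant_decide` are hypothetical. Treating a negative prepared message as grounds for abort, and returning no decision while messages are still missing, are assumptions, since the claim text is truncated at those points.

```python
from enum import Enum
from typing import Dict, Iterable, Optional, Set


class Decision(Enum):
    COMMIT = "commit"
    ABORT = "abort"


def coordinator_decide(
    participants: Iterable[str],
    prepared: Dict[str, bool],
    unreachable: Set[str],
) -> Optional[Decision]:
    """Return the coordinator's decision, or None if it must keep waiting.

    `prepared` maps a participant to the content of its prepared message:
    True means "can commit", False means "cannot commit" (assumed encoding).
    """
    members = list(participants)
    # Abort if communication with any participant is unavailable.
    if any(p in unreachable for p in members):
        return Decision.ABORT
    # Assumption: a prepared message saying "cannot commit" leads to abort.
    if any(prepared.get(p) is False for p in members):
        return Decision.ABORT
    # Commit only when every participant has reported that it can commit.
    if all(prepared.get(p) is True for p in members):
        return Decision.COMMIT
    return None  # some prepared messages are still outstanding


def participant_decide(peer_decisions: Iterable[Decision]) -> Optional[Decision]:
    """Decide from the decision messages received from other participants."""
    received = list(peer_decisions)
    if not received:
        return None  # nothing received yet
    # One aborted peer is enough to abort.
    if any(d is Decision.ABORT for d in received):
        return Decision.ABORT
    # All received messages report a commit.
    return Decision.COMMIT
```

For example, `coordinator_decide(["a", "b"], {"a": True}, {"b"})` yields an abort because participant `b` is unreachable, while the same call with an empty unreachable set yields no decision yet.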
The method of claim 1, wherein each of the plurality of participants is further configured to: determine that communication with one of the plurality of participants was previously unavailable and has become available; and [...]
The method of claim 8, wherein said determination that communication with one of the plurality of participants was previously unavailable and has become available is based on a message received from said previously unavailable one of the plurality of participants.
The method of claim 1, wherein each of the plurality of participants is further configured to: receive a request message from one or more of the plurality of participants, the request message requesting the participant's decision to either commit or abort the transaction; and [...]
A method of resolving an atomic transaction comprising multiple transactions among a plurality of participants, each participant residing on a node of a distributed computer system, the method comprising: sending, by the one or more computer processors of each of the plurality of participants, a prepared message to a coordinator, the coordinator residing on the same node as at least one of the plurality of participants, the prepared message indicating that the sending participant is prepared to commit a transaction of the atomic transaction; [...]
The method of claim 11, further comprising receiving, by each of the plurality of participants, a prepare message from an initiator, the prepare message indicating that the receiving participant should prepare for the transaction.
The method of claim 11, wherein receiving a decision from the coordinator comprises receiving a commit message from the coordinator node to commit the transaction.
The method of claim 11, wherein receiving a decision from the coordinator comprises receiving an abort message from the coordinator node to abort the transaction.
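The message flow recited in the method claims above (a prepare message from an initiator, a prepared message to the coordinator, then a commit or abort message from the coordinator) can be pictured with a small state machine. This is only a sketch under stated assumptions: the class name `Participant`, the state names, and the `outbox` list standing in for a network layer are all hypothetical and do not come from the claims.

```python
from typing import List, Tuple


class Participant:
    """Toy participant that reacts to prepare and decision messages."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = "initial"
        # Hypothetical stand-in for the network: (recipient, message) pairs.
        self.outbox: List[Tuple[str, str]] = []

    def on_prepare(self) -> None:
        """Handle a prepare message from the initiator."""
        if self.state != "initial":
            raise ValueError(f"unexpected prepare in state {self.state}")
        self.state = "prepared"
        # Tell the coordinator that this participant is prepared to commit.
        self.outbox.append(("coordinator", "prepared"))

    def on_decision(self, message: str) -> None:
        """Handle a commit or abort message from the coordinator."""
        if self.state != "prepared":
            raise ValueError(f"unexpected decision in state {self.state}")
        if message == "commit":
            self.state = "committed"
        elif message == "abort":
            self.state = "aborted"
        else:
            raise ValueError(f"unknown decision message: {message}")
```

A participant that receives a prepare message followed by a commit message ends in the `committed` state with one `prepared` message queued for the coordinator.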
Therefore, the final global state will contain both commit and abort states.
3 Single Site Failure in a Multi-Site [...]
A site may fail after sending or receiving some of the messages needed for a successful transition in the FSA. We assume that if a failure occurs before the transition from a state s to another state r, the FSA stays in state s. We say that the failed site fails at state s. If a protocol has no local state with both commit and abort states in its concurrency set, like the three-phase commit protocol shown in figure 2, then it can independently recover from an arbitrary single site failure. For this, besides [...]
RULE For any intermediate local state s, if there exists a state t at other site in the sender set of s, such that t has a failure transition to a commit state, then assign a transition labeled by timeout/toc from s to a commit state.
Since the failure site may have sent out some outgoing messages before it fails, some sites may receive those messages and assume that the failure site made its normal transition. To ensure that every one has the same view of the failed site, those sites who detect a timeout should prop[...]
RULE For any intermediate local state s, assign a transition labeled by toc/- from s to a commit state and a transition labeled by toa/- from state s to an abort state.
To handle this situation, we [...] toc messages arrives. The resulting protocol of applying the above rules to the three phase commit protocol is shown in figure 3.
If no failure occurs, the number of messages required for the protocol remains unchanged. In case of a failure, the [...] If fewer sites detect a timeout then fewer propagated messages will be sent. The maximum is N - 2 [...] and the [...]
[...] rules [...], [...], and [...] are sufficient to make the protocol independently recover from any single site failure.
This overhead can be reduced by sending messages in some predefined order, or having one special site broadcast all the messages.
PROOF: Assume that there are N sites, numbered as [...] There are two cases to be considered. State s1 has a failure transition to a commit state. Since site 1 fails before it sends out all outgoing messages, there exists at least one site which will detect a timeout, send toc, and move to a commit state (rule [...]). The other operational sites either are in the commit state already (definition of the concurrency set) or will receive toc and move to a commit state (rule [...]). Therefore, all sites have a common final state commit. Similar arguments can be used to show that all sites have a common final state abort.
THEOREM If a commit protocol has no local state with both commit and abort state in its concurrency set, then rules [...], [...], and [...] are sufficient to make the protocol [...]
4 Multiple Site Failures
It is known that no independent recovery scheme can handle more than one site failure [7]. With multiple failures, different failed sites may move to different final states upon recovering. The conflict occurs only when the union of concurrency sets of failed sites contains both commit and abort states. In this section, we present a recovery scheme, which can handle multiple site failures including total failures, by utilizing extra recovery states. On recovery, a failed site [...] previous transaction. In some situations, where a site fails in some special state, it can independently recover to a final [...] The scheme we present identifies such situations, [...] We first present the scheme without any attempt for independent recovery and then show how, in some cases, independent recovery can be performed.
[...] protocol with the FSAs in states x_i and y_j, respectively. If C(x_i) contains a commit state and C(y_j) contains an abort state, then under independent recovery, the system may reach an inconsistent state after recovering. For the sake of resolving the conflict, we introduce two extra states: recovery state r and recovery waiting state rw. Upon recovering, a failed site moves to the recovery state r, asks information from others, and waits for response in state rw. Upon receiving such requests, a site should respond commit if it has committed the transaction, abort if it has aborted it, or unknown if [...]
A failed site on recovery starts from state r, then sends recovery message to all others and moves to state rw. In state rw, upon receiving a commit or an abort message, it moves to state c or a, respectively. Because all operational sites reach the same final state, one commit or abort message is sufficient for failed sites to decide where to go. If a failed site receives the unknown messages from all others, it means that all sites failed during the transaction, and it can move to state a. Others will receive the abort message from this site and move to state a eventually.
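The recovery exchange described above, in which a recovering site asks the others and each replies commit, abort, or unknown, can be sketched in Python as follows. The function names `respond` and `resolve` and the string encodings of states and replies are hypothetical. Replying unknown from every state other than the two final states is an assumption, because the source text is cut off where it states that condition.

```python
from typing import List, Optional


def respond(local_state: str) -> str:
    """Reply that a site sends to a recovering site's request."""
    if local_state == "c":
        return "commit"
    if local_state == "a":
        return "abort"
    # Assumption: every non-final state answers "unknown".
    return "unknown"


def resolve(replies: List[str], other_sites: int) -> Optional[str]:
    """Final state for a site in the recovery waiting state.

    Returns "c" or "a" once the outcome is known, or None while the
    site has to keep waiting for more replies.
    """
    # One commit or abort reply is enough to decide.
    if "commit" in replies:
        return "c"
    if "abort" in replies:
        return "a"
    # Unknown from all other sites means every site failed: abort.
    if len(replies) == other_sites and all(r == "unknown" for r in replies):
        return "a"
    return None
```

With three other sites, `resolve(["unknown", "commit"], 3)` decides commit immediately, whereas `resolve(["unknown", "unknown"], 3)` keeps waiting for the third reply.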
[...] that C(t) contains a commit state, then assign a transition labeled by timeout/toc from s to a commit state. Otherwise assign a transition labeled by timeout/toa from s to an abort state.
RULE For any intermediate local state s, assign a transition [...]
[...] to identify the failure commit states and the failure abort states. For instance, in the three phase commit protocol, the [...] According to the previous lemma and definition, we [...] must contain a commit (abort) state also.
RULE For all failure abort states, assign a failure transition to an abort state. For all failure commit states, assign a failure transition to a commit state. For all other nonfinal states, assign a failure transition to the recovery state r.
Hence, any other node that fails with node i during the execution of a commit protocol must be in a state that belongs to C(s_i). In other words, if nodes i and j fail concurrently in states s_i and t_j, then s_i ∈ C(t_j) and [...] Using this property, we can determine whether independent recovery is possible or not. Site i can perform independent recovery if and only if for each t_j ∈ C(s_i), C(t_j) contains the same final state as C(s_i).
PROOF: Since all states which can fail with s_i are in C(s_i), if their concurrency sets have the same final state as C(s_i), then s_i can perform independent recovery to the final state contained by C(s_i). By the definition of concurrency set, some operational sites may already be in the final state, [...]
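The independent recovery condition stated above, which compares the final states found in the concurrency set of a failed state with those found in the concurrency sets of every state it can fail alongside, can be checked mechanically. The sketch below is one possible reading, not the paper's procedure: it assumes that "contains the same final state" means each set contains exactly one kind of final state (commit or abort) and that the kinds match. The function names and the toy concurrency sets are hypothetical and are not taken from any protocol in the text.

```python
from typing import Dict, Set


def final_kinds(states: Set[str], kind_of: Dict[str, str]) -> Set[str]:
    """Kinds of final state ("commit" or "abort") present in a set of states."""
    return {kind_of[s] for s in states if s in kind_of}


def can_recover_independently(
    s: str,
    concurrency: Dict[str, Set[str]],
    kind_of: Dict[str, str],
) -> bool:
    """Check the condition for a site that failed in local state s."""
    target = final_kinds(concurrency[s], kind_of)
    # Assumed reading: C(s) must point at exactly one kind of final state.
    if len(target) != 1:
        return False
    # Every state that can fail together with s must agree on that kind.
    return all(
        final_kinds(concurrency[t], kind_of) == target
        for t in concurrency[s]
    )


# Hypothetical toy data: "c" is a commit state, "a" is an abort state.
KIND_OF = {"c": "commit", "a": "abort"}
CONCURRENCY = {
    "w": {"w", "a", "p"},
    "p": {"p", "c"},
    "c": {"p", "c"},
    "a": {"w", "a"},
}
```

In this toy data a site failing in state `p` can recover on its own, because every state in its concurrency set sees only commit, while a site failing in `w` cannot, because `p` is in its concurrency set and `p` sees commit rather than abort.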