CORTEX-1
CORTEX-1 was the first implementation of a distributed neural processor. It was designed as a high-performance development system for AURA hardware, software, and applications. A major factor in the motivation and design of the system was to demonstrate the scalability of the AURA methods and technology, which was also a key aim of the AURA II project. The system has since been retired and now forms part of the James Austin Computer Collection.
In addition, four of the nodes are equipped with Dolphin Scali high-speed (80 MBytes/s) interconnects, and all nodes have 100 Mbit Ethernet. The system can be maintained from a single point using KVM switching, and external access to the cluster is provided via a primary gateway node.
Cortex-1 Physical Equipment Configuration
Details of the neural accelerator cards can be found on the Presence hardware page. The client-server system is a distributed form of the AURA library described elsewhere. It is currently designed around a stripped-down remote method invocation (RMI) model built on the open-source Adaptive Communication Environment (ACE) toolkit. Sockets provide high-speed communication between any part of the Cortex-1 machine and a client system outside it, allowing Cortex-1 to be configured as a flexible compute server for AURA-enabled applications, for both software and hardware-accelerated tasks. The software interface is designed to be virtually transparent to a programmer familiar with the standard AURA library.
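The wire format of the client-server layer is not documented here, but a socket-based RMI-style layer typically frames each request as a small fixed header plus a payload. The sketch below illustrates that pattern with an invented OP_RECALL opcode and header layout (a connected local socket pair stands in for the client-server TCP link); it is not the actual AURA protocol:

```python
# Sketch of the opcode + payload framing an RMI-style socket layer might use.
# OP_RECALL and the header layout are invented for illustration; the real
# AURA client-server wire format is not documented here.
import socket
import struct

OP_RECALL = 1                      # hypothetical opcode
HEADER = struct.Struct("!II")      # (opcode, payload length in bytes)

def send_request(sock, opcode, payload):
    """Frame a request as a fixed header followed by the raw payload."""
    sock.sendall(HEADER.pack(opcode, len(payload)) + payload)

def recv_request(sock):
    """Read one framed request: the header first, then exactly the payload."""
    opcode, length = HEADER.unpack(sock.recv(HEADER.size, socket.MSG_WAITALL))
    return opcode, sock.recv(length, socket.MSG_WAITALL)

# A connected pair of sockets stands in for the client/server link.
client, server = socket.socketpair()
send_request(client, OP_RECALL, struct.pack("!4I", 1, 0, 1, 1))
opcode, payload = recv_request(server)
print(opcode, struct.unpack("!4I", payload))   # 1 (1, 0, 1, 1)
```

Framing every request this way is what lets a single server loop dispatch both software and hardware-accelerated operations from the same socket.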
Cortex-1 Client-Server Relationships
Cortex-1: Main Components of the Client-Server System
Further Developments
Future developments of the software are likely to include a full implementation of futures and load balancing; ultimately, it is intended that the software should be GRID-enabled.
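The "futures" mentioned above are a standard concurrency pattern rather than anything AURA-specific. In the minimal sketch below, fake_recall stands in for a remote CMM recall (the node numbering and return value are invented for illustration); futures let a client issue recalls to several nodes concurrently and gather the results as they complete:

```python
# Sketch of the "futures" idea: issue recalls to several nodes at once and
# collect the results. fake_recall is a stand-in for a remote CMM recall.
from concurrent.futures import ThreadPoolExecutor

def fake_recall(node, vector):
    # Placeholder for a remote recall on the given node; here it just
    # returns the number of set bits in the input vector.
    return (node, sum(vector))

vectors = {0: [1, 0, 1], 1: [1, 1, 1], 2: [0, 0, 1]}
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(fake_recall, n, v) for n, v in vectors.items()]
    results = dict(f.result() for f in futures)
print(results)   # {0: 2, 1: 3, 2: 1}
```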
Presence is the current family of hardware designs which accelerate the core CMM computations needed in AURA applications. This section looks at how the functionality of PRESENCE has been seamlessly incorporated into the AURA library, and how multiple PRESENCE cards are used to scale up CMM size. Single-card PRESENCE recall performance is proportional to the CMM's output (separator) width; therefore, by striping an output vector across multiple cards and executing simultaneous recalls in parallel, we can also scale performance. We have identified two levels of scalable AURA:
PRESENCE Device Driver
The test-bed environment for Scalable AURA is the Cortex-1 cluster. Each node of the cluster runs RedHat Linux, so a low-level device driver (/dev/presdrv) was written for the PCI PRESENCE card. The device driver is inserted into the Linux kernel as a kernel-space module. A static library (hw_ops) enables the user to access the driver via the ioctl() system routine. A list of the library functions is available on-line. The driver has been extended to allow parallel operation of multiple cards (a maximum of 5) in a system.

Multi-PRESENCE Scalability
A HardwareCMM class was added to the AURA library that addresses one or more PRESENCE cards on a system's PCI bus. A maximum of 5 PCI cards can be attached to a node, boosting the available CMM weights memory to 640 MBytes. The diagram below illustrates how the HardwareCMM is constructed.

Multi-Node Scalability
A client-server framework was written that allows a HardwareCMM to be accessed remotely over the cluster. The framework uses the object-oriented Adaptive Communication Environment (ACE), an open-source networking toolkit that is portable across platforms. A DistributedCMM class was created that utilises the client-server framework to distribute a CMM over multiple HardwareCMMs, and hence multiple PRESENCE cards.
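The column-striping scheme that underlies both levels of scaling can be sketched in a few lines. This is a pure-software stand-in, not the hw_ops or HardwareCMM API: the weight matrix's output (separator) columns are split across hypothetical "cards", each stripe is recalled independently, and the partial outputs are concatenated:

```python
# Sketch of output striping: a binary CMM's output columns are split across
# several "cards", each card recalls its stripe independently, and the
# stripes are concatenated. Pure-Python illustration, not the real hw_ops API.

def recall_stripe(weights_stripe, input_vec):
    """Recall on one stripe: sum the active input rows into each column."""
    cols = len(weights_stripe[0])
    return [sum(weights_stripe[r][c] for r, bit in enumerate(input_vec) if bit)
            for c in range(cols)]

def striped_recall(weights, input_vec, n_cards):
    """Split the weight matrix column-wise over n_cards and recall each stripe."""
    cols = len(weights[0])
    width = (cols + n_cards - 1) // n_cards          # columns per card
    out = []
    for card in range(n_cards):
        stripe = [row[card * width:(card + 1) * width] for row in weights]
        if stripe[0]:                                # skip empty trailing stripes
            out.extend(recall_stripe(stripe, input_vec))
    return out

# 4 input lines, 6 output columns, striped over 3 "cards" of 2 columns each.
weights = [[1, 0, 1, 0, 1, 0],
           [0, 1, 0, 1, 0, 1],
           [1, 1, 0, 0, 1, 1],
           [0, 0, 1, 1, 0, 0]]
x = [1, 0, 1, 0]
print(striped_recall(weights, x, 3))   # [2, 1, 1, 0, 2, 1]
```

Because each stripe depends only on its own columns, the striped result is identical to a single-card recall of the full matrix, which is what makes parallel recall across cards (and across nodes, via the DistributedCMM) transparent to the caller.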