Research & Development
CoCoLink is constantly evolving through continuous research and development. We are pursuing innovation in four areas: 'System Integration', 'Processor', 'Interconnect' and 'Software'. The next-generation products CoCoLink is developing in these areas are intended to give you a more satisfying computing experience.
System Integration
Klimax/CliC vHPC series
Multi-functional, many-GPU vHPC system based on an innovative and advanced architecture
- Supports up to 70 high-performance PCIe-based computing devices.
- Supports full-function peering (peer-to-peer) between all PCIe devices for optimal GPU/MIC-centric computing.
- Suited to scientific/engineering applications, DBMS applications, etc.
- Supports many PCIe-based high-performance computing and network devices.

TileraGX server system
TileraGX72 many-core processor server system for network applications

Internal interconnect for system expansion
- PCIe fabric subsystem with up to 72 slots, 1:2 or non-blocking switching


Processor
64/80-bit Optimal Processor-core: SnakeHead, PythonHead, DragonHead
- 64/80-bit processor-core based on 'Stack Computer Architecture'
- Very low logic gate count and low power consumption
- Highly efficient and optimized for scientific/engineering computing
- 64-bit data for the memory interface, 80/82-bit data for operations
- Simple instruction set based on post-order (postfix) notation
- 64-bit instruction word with three 5-bit operation codes and a 48-bit address
- 1~3 operations per clock
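
To illustrate what post-order notation means on a stack-based core, here is a minimal sketch in Python. It is not CoCoLink's instruction set: the operation names (`add`, `sub`, `mul`, `div`) and the `run_postfix` helper are hypothetical, chosen only to show how operands are pushed first and operators consume them from the stack.

```python
from typing import List, Union

Token = Union[int, float, str]

# Hypothetical operation names; the real opcode set is not given in the source.
BINARY_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b,
}


def run_postfix(program: List[Token]) -> float:
    """Evaluate a post-order (postfix) program on an operand stack."""
    stack: List[float] = []
    for token in program:
        if isinstance(token, str):
            # An operator pops its two operands and pushes the result.
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[token](left, right))
        else:
            # A number is simply pushed onto the stack.
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed program: stack should hold exactly one result")
    return stack[0]


# (2 + 3) * 4 written in post-order form: operands first, operator last.
result = run_postfix([2, 3, "add", 4, "mul"])
```

Because operators take their inputs implicitly from the stack, instructions need no register fields, which is one common reason stack designs can keep the instruction set and gate count small.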

SnakeHead: 1~3 instructions per clock (avg. 1.25)
- 64/80-bit ALU in 100K logic gates with a 1,280-bit stack register
- 1+ GHz, 1 or more FLOP per clock, 8+8 KB L1 cache

PythonHead: 1~4 instructions per clock (avg. 1.5)
- 64/80-bit ALU in 200K logic gates with a 2,560-bit stack register
- 2+ GHz, 1 or more FLOP per clock, 16+16 KB L1 cache

DragonHead: 1~4 instructions per clock (avg. 1.5)
- 64/80-bit ALU in 300K logic gates with a 5,120-bit stack register
- 2.5+ GHz, 1 or more FLOP per clock, 16+32 KB L1 cache

MeDUSA™, High Performance Vector Processor:
MeDUSA™ is an advanced, very high-performance processor with 80-bit operations and a 64-bit interface.
- 10~40K SnakeHead, PythonHead or DragonHead processor-cores
- Meshed backbone with 10~20 Tbps bandwidth and a 5x8/10x6/10x8 fabric
- 8 processor-cores per column with 32 KB L1 cache
- 256 processor-cores per block with 2 MB L2 cache
- 40~80 blocks per processor with 64/128 MB L3 cache
- Up to 256 GB of HBM (High-Bandwidth Memory), 8 banks, >800 GB/s
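
The core hierarchy in the list above can be worked through with a little arithmetic. The sketch below uses only the quoted figures, plus one assumption the source does not state: that each cell of the 5x8/10x6/10x8 fabric holds exactly one 256-core block. Under that assumption the 5x8 and 10x8 layouts give 40 and 80 blocks, which lines up with the quoted 40~80 block range.

```python
# Figures quoted in the list above.
CORES_PER_COLUMN = 8
CORES_PER_BLOCK = 256
FABRIC_LAYOUTS = [(5, 8), (10, 6), (10, 8)]

# Derived: number of 8-core columns that make up one 256-core block.
COLUMNS_PER_BLOCK = CORES_PER_BLOCK // CORES_PER_COLUMN


def cores_for_fabric(rows: int, cols: int) -> int:
    """Total cores, ASSUMING one block per fabric cell (not stated in the source)."""
    return rows * cols * CORES_PER_BLOCK


# Core count for each fabric layout under the one-block-per-cell assumption.
core_counts = {f"{r}x{c}": cores_for_fabric(r, c) for r, c in FABRIC_LAYOUTS}

for layout, cores in core_counts.items():
    print(f"{layout}: {cores:,} cores")
```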

Lineup:
- MeDUSA-10: 4G logic gates, 10K SnakeHead cores, 10+ TFLOPS, PCIe x16 Gen.3 interface (2018)
- MeDUSA-40: 8G logic gates, 20K PythonHead cores, 40+ TFLOPS, PCIe x16 Gen.4 interface (2020)
- MeDUSA-100: 12G logic gates, 40K DragonHead cores, 100+ TFLOPS, PCIe x16 Gen.4 interface (2022)
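
The TFLOPS figures in the lineup can be related to the per-core numbers quoted earlier (1+, 2+ and 2.5+ GHz, each with 1 or more FLOP per clock). The following is a rough sketch, assuming "K" means 1,000 cores and taking the minimum quoted clock and 1 FLOP per clock, so the results are lower bounds rather than measured performance. The `peak_tflops` helper is hypothetical.

```python
def peak_tflops(cores: int, clock_ghz: float, flop_per_clock: float = 1.0) -> float:
    """Peak throughput in TFLOPS: cores x clock (GHz) x FLOP per clock, scaled from GFLOPS."""
    return cores * clock_ghz * flop_per_clock / 1000.0


# (cores, minimum clock in GHz) per product, taken from the figures in the text.
lineup = {
    "MeDUSA-10": (10_000, 1.0),   # SnakeHead cores at 1+ GHz
    "MeDUSA-40": (20_000, 2.0),   # PythonHead cores at 2+ GHz
    "MeDUSA-100": (40_000, 2.5),  # DragonHead cores at 2.5+ GHz
}

estimates = {name: peak_tflops(cores, ghz) for name, (cores, ghz) in lineup.items()}

for name, tflops in estimates.items():
    print(f"{name}: at least {tflops:.0f} TFLOPS")
```

With these inputs the estimates come out at 10, 40 and 100 TFLOPS, matching the "10+", "40+" and "100+" figures in the lineup.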



Interconnect
Large scale switch subsystem:
- InfiniBand EDR (100 Gbps) switches with 48, 144, 288 or 768 ports
- Switching subsystem with up to 25,000 ports at 100 Gbps bandwidth
- Fat-tree protocol research
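
For a sense of scale, the standard sizing rule for a full-bisection fat tree says that L levels of switches with k ports each can connect up to 2 x (k/2)^L endpoints. The sketch below applies that textbook formula to 48-port switches. Whether CoCoLink's subsystem uses exactly this topology is an assumption, and `fat_tree_endpoints` is a hypothetical helper.

```python
def fat_tree_endpoints(radix: int, levels: int) -> int:
    """Maximum endpoints of a full-bisection fat tree built from `radix`-port switches.

    Textbook sizing rule: 2 * (radix / 2) ** levels.
    """
    if radix < 2 or radix % 2 != 0:
        raise ValueError("radix must be an even number of ports")
    if levels < 1:
        raise ValueError("levels must be at least 1")
    return 2 * (radix // 2) ** levels


# Endpoint capacity for one, two and three levels of 48-port switches.
capacity = {levels: fat_tree_endpoints(48, levels) for levels in (1, 2, 3)}

for levels, ports in capacity.items():
    print(f"{levels} level(s) of 48-port switches: {ports:,} endpoints")
```

Three levels of 48-port switches give 27,648 endpoints, the same order of magnitude as the 25,000-port figure quoted above.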

High-bandwidth Switch IC:
- 256 or more lanes with full-function switching, ~32 Gbps per lane
- Fat-tree protocol research
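
The lane figures above imply an aggregate raw bandwidth for the switch IC. This is a simple sketch of that arithmetic, assuming the minimum of 256 lanes, 32 Gbps per lane, and no allowance for encoding or protocol overhead. The `aggregate_tbps` helper is hypothetical.

```python
def aggregate_tbps(lanes: int, gbps_per_lane: float) -> float:
    """Raw aggregate bandwidth in Tbps, ignoring encoding and protocol overhead."""
    return lanes * gbps_per_lane / 1000.0


# Minimum configuration quoted in the text: 256 lanes at about 32 Gbps each.
switch_ic_tbps = aggregate_tbps(256, 32.0)
print(f"Aggregate raw bandwidth: {switch_ic_tbps:.3f} Tbps")
```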



Software
Scientific/Engineering Application

- Strong focus on developing future-oriented applications
- Applies GPU-centric computing methodology in various fields
- Commercialization:
  - Artificial Intelligence (Machine Learning/Deep Learning)
  - High-performance transcoding and 3D rendering applications

Business Application: RDBMS on GPU

Development Tools
C-bolt, an advanced C-like compiler for scientific computation