Architecture and C++-programming environment of a highly parallel image signal processor

Research output: Contribution to journal, Article, Research, peer review

Authors

  • J. Kneip
  • M. Ohmacht
  • K. Rönner
  • P. Pirsch

Details

Original language: English
Pages (from-to): 391-408
Number of pages: 18
Journal: Microprocessing and Microprogramming
Volume: 41
Issue number: 5-6
Publication status: Published - Oct 1995

Abstract

A highly parallel single-chip image signal processor architecture has been derived by analysis of image processing algorithms. Available levels of parallelism and their associated demands on data access, control and complexity of operations were taken into account. The RISC-architecture, called "HiPAR-DSP", consists of a control unit, 16 parallel ASIMD-controlled datapaths with autonomous addressing and instruction selection capability, a local data cache per data path, a shared memory with matrix type data access and a powerful DMA-unit. The proposed architecture was designed by assessing the results of an analysis of characteristic algorithm properties with respect to their inherent parallelization resources, achievable speed up and implementation costs. This resulted in a proper balance between the degree of parallelism and flexibility, leading to a high performance for a wide field of applications. Additional measures were taken to support an efficient high level programmability of the processor. This was achieved by the concurrent implementation of special architectural features and a C++-programming environment. It consists of an adaptation of the GNU C++-compiler and an optimizing assembler, supporting all levels of concurrence offered by the hardware. While most levels of parallelization are kept invisible to the programmer, data-level parallelism is expressed by the programmer using special new data types added to the standard C/C++-data-types. A sustained performance of about 2.0 Gigaoperations per second is achieved by the 100 MHz clocked processor for numerous image processing algorithms, leading to a processing time e.g. for a normalized correlation of a 512 × 512 image with a 32 × 32 correlation mask of 450 ms. Thus, a performance is achieved with a programmable parallel processor architecture that hitherto required the application of a dedicated integrated circuit.
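
The throughput figures quoted in the abstract can be cross-checked with a rough operation count. The sketch below is only a plausibility check, not the authors' cost model. It assumes the correlation is evaluated only where the mask fits entirely inside the image, and it assumes four operations per mask pixel (a multiply and an add for the cross term, plus a multiply and an add for the local image energy used in the normalization). The function names and both assumptions are illustrative and do not come from the paper.

```python
# Back-of-envelope check of the figures quoted in the abstract.

# Figures taken from the abstract.
CLOCK_HZ = 100e6            # 100 MHz clock
DATAPATHS = 16              # parallel datapaths
SUSTAINED_OPS_PER_S = 2.0e9 # "about 2.0 Gigaoperations per second"
IMAGE_SIDE = 512            # 512 x 512 image
MASK_SIDE = 32              # 32 x 32 correlation mask

# ASSUMPTION (not from the paper): operations spent per mask pixel,
# i.e. multiply + add for the cross term and multiply + add for the
# local image energy needed by the normalization.
OPS_PER_MASK_PIXEL = 4


def valid_positions(image_side: int, mask_side: int) -> int:
    """Mask placements that fit entirely inside a square image.

    ASSUMPTION: no border padding, so each axis has
    image_side - mask_side + 1 placements.
    """
    per_axis = image_side - mask_side + 1
    return per_axis * per_axis


def estimated_time_ms(image_side: int, mask_side: int,
                      ops_per_mask_pixel: int, ops_per_second: float) -> float:
    """Estimated run time in milliseconds at a given sustained throughput."""
    total_ops = (valid_positions(image_side, mask_side)
                 * mask_side * mask_side * ops_per_mask_pixel)
    return 1000.0 * total_ops / ops_per_second


# Average operations each datapath must complete per clock cycle
# to reach the quoted sustained rate.
ops_per_datapath_cycle = SUSTAINED_OPS_PER_S / (DATAPATHS * CLOCK_HZ)

estimate_ms = estimated_time_ms(IMAGE_SIDE, MASK_SIDE,
                                OPS_PER_MASK_PIXEL, SUSTAINED_OPS_PER_S)

print(f"operations per datapath per cycle: {ops_per_datapath_cycle:.2f}")
print(f"estimated correlation time: {estimate_ms:.0f} ms")
```

Under these assumptions the estimate comes out at roughly 470 ms, the same order as the 450 ms quoted in the abstract. A different operation count per mask pixel or a different border treatment shifts the estimate proportionally, so the agreement should be read as a sanity check only.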

Keywords

    Autonomous, Image processing, Parallel language extension, Parallel VLSI RISC processor, Parallelizing compiler, Shared memory architecture, SIMD controlling

Cite this

Architecture and C++-programming environment of a highly parallel image signal processor. / Kneip, J.; Ohmacht, M.; Rönner, K. et al.
In: Microprocessing and Microprogramming, Vol. 41, No. 5-6, 10.1995, p. 391-408.

Kneip J, Ohmacht M, Rönner K, Pirsch P. Architecture and C++-programming environment of a highly parallel image signal processor. Microprocessing and Microprogramming. 1995 Oct;41(5-6):391-408. doi: 10.1016/0165-6074(95)00023-H
Kneip, J. ; Ohmacht, M. ; Rönner, K. et al. / Architecture and C++-programming environment of a highly parallel image signal processor. In: Microprocessing and Microprogramming. 1995 ; Vol. 41, No. 5-6. pp. 391-408.
@article{020566378cd846c29a7ad4165f0e351e,
title = "Architecture and {C++}-programming environment of a highly parallel image signal processor",
abstract = "A highly parallel single-chip image signal processor architecture has been derived by analysis of image processing algorithms. Available levels of parallelism and their associated demands on data access, control and complexity of operations were taken into account. The RISC-architecture, called {"}HiPAR-DSP{"}, consists of a control unit, 16 parallel ASIMD-controlled datapaths with autonomous addressing and instruction selection capability, a local data cache per data path, a shared memory with matrix type data access and a powerful DMA-unit. The proposed architecture was designed by assessing the results of an analysis of characteristic algorithm properties with respect to their inherent parallelization resources, achievable speed up and implementation costs. This resulted in a proper balance between the degree of parallelism and flexibility, leading to a high performance for a wide field of applications. Additional measures were taken to support an efficient high level programmability of the processor. This was achieved by the concurrent implementation of special architectural features and a C++-programming environment. It consists of an adaptation of the GNU C++-compiler and an optimizing assembler, supporting all levels of concurrence offered by the hardware. While most levels of parallelization are kept invisible to the programmer, data-level parallelism is expressed by the programmer using special new data types added to the standard C/C++-data-types. A sustained performance of about 2.0 Gigaoperations per second is achieved by the 100 MHz clocked processor for numerous image processing algorithms, leading to a processing time e.g. for a normalized correlation of a 512 × 512 image with a 32 × 32 correlation mask of 450 ms. Thus, a performance is achieved with a programmable parallel processor architecture that hitherto required the application of a dedicated integrated circuit.",
keywords = "Autonomous, Image processing, Parallel language extension, Parallel VLSI RISC processor, Parallelizing compiler, Shared memory architecture, SIMD controlling",
author = "J. Kneip and M. Ohmacht and K. R{\"o}nner and P. Pirsch",
year = "1995",
month = oct,
doi = "10.1016/0165-6074(95)00023-H",
language = "English",
volume = "41",
pages = "391--408",
journal = "Microprocessing and Microprogramming",
issn = "0165-6074",
publisher = "Elsevier",
number = "5-6",

}

TY  - JOUR
T1  - Architecture and C++-programming environment of a highly parallel image signal processor
AU  - Kneip, J.
AU  - Ohmacht, M.
AU  - Rönner, K.
AU  - Pirsch, P.
PY  - 1995/10
Y1  - 1995/10
AB  - A highly parallel single-chip image signal processor architecture has been derived by analysis of image processing algorithms. Available levels of parallelism and their associated demands on data access, control and complexity of operations were taken into account. The RISC-architecture, called "HiPAR-DSP", consists of a control unit, 16 parallel ASIMD-controlled datapaths with autonomous addressing and instruction selection capability, a local data cache per data path, a shared memory with matrix type data access and a powerful DMA-unit. The proposed architecture was designed by assessing the results of an analysis of characteristic algorithm properties with respect to their inherent parallelization resources, achievable speed up and implementation costs. This resulted in a proper balance between the degree of parallelism and flexibility, leading to a high performance for a wide field of applications. Additional measures were taken to support an efficient high level programmability of the processor. This was achieved by the concurrent implementation of special architectural features and a C++-programming environment. It consists of an adaptation of the GNU C++-compiler and an optimizing assembler, supporting all levels of concurrence offered by the hardware. While most levels of parallelization are kept invisible to the programmer, data-level parallelism is expressed by the programmer using special new data types added to the standard C/C++-data-types. A sustained performance of about 2.0 Gigaoperations per second is achieved by the 100 MHz clocked processor for numerous image processing algorithms, leading to a processing time e.g. for a normalized correlation of a 512 × 512 image with a 32 × 32 correlation mask of 450 ms. Thus, a performance is achieved with a programmable parallel processor architecture that hitherto required the application of a dedicated integrated circuit.
KW  - Autonomous
KW  - Image processing
KW  - Parallel language extension
KW  - Parallel VLSI RISC processor
KW  - Parallelizing compiler
KW  - Shared memory architecture
KW  - SIMD controlling
UR  - http://www.scopus.com/inward/record.url?scp=0029386737&partnerID=8YFLogxK
U2  - 10.1016/0165-6074(95)00023-H
DO  - 10.1016/0165-6074(95)00023-H
M3  - Article
AN  - SCOPUS:0029386737
VL  - 41
SP  - 391
EP  - 408
JO  - Microprocessing and Microprogramming
JF  - Microprocessing and Microprogramming
SN  - 0165-6074
IS  - 5-6
ER  - 