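National Natural Science Foundation of China (60703017)

Works: 12 | Citations: 59 | H-index: 4
Related authors: 胡伟武, 刘金刚, 李祖松, 章隆兵, 高翔
Related institutions: Chinese Academy of Sciences; Graduate University of Chinese Academy of Sciences; Beijing Loongson Technology Service Center Co., Ltd. (北京龙芯中科技术服务中心有限公司)
Funding sources: National Natural Science Foundation of China; National High Technology Research and Development Program of China; National Basic Research Program of China
Related fields: Automation and Computer Technology; Electronics and Telecommunications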

Document type

  • Journal articles (12)
  • Conference papers (1)

Field

  • 自动化与计算... (10)
  • Electronics and Telecommunications (3)

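Topics

  • Processor (6)
  • Loongson (龙芯) (3)
  • Multi-core (2)
  • Architecture (2)
  • Stack (1)
  • Multiprocessor (1)
  • Multi-core processing (1)
  • Multi-core processor (1)
  • Multithreading (1)
  • Multithreaded design (1)
  • Performance analysis (1)
  • Consistency (1)
  • Fault tolerance (1)
  • Fault-tolerance techniques (1)
  • 3D mouse (1)
  • Mouse (1)
  • Transient fault (1)
  • Simultaneous multithreading (1)
  • Collision detection (1)
  • Chip multiprocessor (1)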

Institutions

  • Chinese Academy of Sciences (10)
  • 中国科学院研... (3)
  • Nanchang University (1)
  • Capital Normal University (1)
  • 中国石油大学... (1)
  • 中国科学院自... (1)
  • 北京龙芯中科... (1)
  • 中国科学院大... (1)

Authors

  • 胡伟武 (4)
  • 刘金刚 (2)
  • 章隆兵 (2)
  • 李祖松 (2)
  • 刘奇 (1)
  • 杨丽琼 (1)
  • 蔡嵩松 (1)
  • 王焕东 (1)
  • 肖俊华 (1)
  • 曹非 (1)
  • 黄海明 (1)
  • 史岗 (1)
  • 胡明昌 (1)
  • 黄令仪 (1)
  • 周国建 (1)
  • 高茁 (1)
  • 张仕健 (1)
  • 许先超 (1)
  • 陈云霁 (1)
  • 刘云根 (1)

Venues

  • Journa... (2)
  • 计算机研究与... (2)
  • Chinese Journal of Computers (计算机学报) (2)
  • Computer Engineering (计算机工程) (2)
  • Journa... (2)
  • 计算机应用研... (1)
  • 小型微型计算... (1)

Year

  • 2010 (1)
  • 2009 (6)
  • 2008 (6)
12 records; showing 1-10
Design and analysis of a UWB low-noise amplifier in the 0.18μm CMOS process
2009
An ultra-wideband (3.1-10.6 GHz) low-noise amplifier using the 0.18 μm CMOS process is presented. It employs a wideband filter for impedance matching. The current-reuse technique is adopted to lower the power consumption. The noise contributions of the second-order and third-order Chebyshev filters for input matching are analyzed and compared in detail. The measured power gain is 12.4-14.5 dB within the bandwidth. The noise figure (NF) ranges from 4.2 to 5.4 dB over 3.1-10.6 GHz. Good input matching is achieved over the entire bandwidth. The test chip consumes 9 mW (without the output buffer used for measurement) from a 1.8 V power supply and occupies 0.88 mm^2.
杨袆, 高茁, 杨丽琼, 黄令仪, 胡伟武
Keywords: ULTRA-WIDEBAND, CMOS
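A Linux Driver Supporting LVDS Output of the V2 Display Chip (Cited by: 1)
2010
An ETX computer motherboard was designed and implemented around the independently developed Loongson 2F processor. The board uses the V2 display chip and supports simultaneous display on both VGA and LVDS ports at resolutions up to 1600×1200. The original Linux display driver code already implements VGA display for the V2 chip, but its support for LVDS display is incomplete. Supporting LVDS port output on the V2 chip therefore requires a series of improvements to the Linux display driver. This paper describes the improvements and optimizations made in the Linux driver source code for LVDS port output of the V2 display chip.
朱晓静, 褚越杰, 胡明昌, 李正民
Keywords: display chip, LVDS, driver, Linux kernel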
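Software Pipelining Based on Four-Phase Manual Optimization (Cited by: 1)
2009
Code size is one of the important factors when optimizing embedded systems with limited storage resources. With this in mind, a four-phase manual optimization method for software pipelining (FPMO) is proposed, using the oprofile performance analysis tool and the EEMBC benchmark suite as the workload. Experimental results on the telecom autocorrelation program show that FPMO gains a 40.678% performance improvement at the cost of a 2.04% increase in code size, whereas compiler automatic optimization alone gains a 38.33% performance improvement at the cost of a 33.35% increase in code size.
周国建, 吴少刚, 李祖松, 史岗
Keywords: performance analysis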
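Design and Implementation of a 3D Game Based on a 3D Desktop Operating System (Cited by: 1)
2008
To implement a small 3D puzzle game embedded in a 3D desktop operating system, this paper follows the game's development process and discusses and solves the key design problems encountered: random cutting, 3D mouse picking, and collision detection. Bounding-box tests, intersection tests, and direction-vector tests are used to accelerate collision detection, raising rendering speed by 34.8%, so that the final game runs smoothly at 31 fps.
田子潇, 黄海明, 刘云根, 刘金刚
Keywords: 3D mouse, collision detection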
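The bounding-box test mentioned in the abstract above can be sketched as a cheap overlap check that runs before any exact intersection test. This is only a minimal illustration, assuming axis-aligned boxes (the abstract does not say which kind of bounding box the game uses); the `AABB` class and `aabb_overlap` function are hypothetical names, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class AABB:
    """Axis-aligned bounding box (an assumption; the paper may use another box type)."""
    min_pt: Vec3  # smallest x, y, z corner
    max_pt: Vec3  # largest x, y, z corner


def aabb_overlap(a: AABB, b: AABB) -> bool:
    """Return True when the two boxes overlap (or touch) on all three axes."""
    return all(
        a.min_pt[i] <= b.max_pt[i] and b.min_pt[i] <= a.max_pt[i]
        for i in range(3)
    )


box_a = AABB((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
box_b = AABB((0.5, 0.5, 0.5), (2.0, 2.0, 2.0))
box_c = AABB((3.0, 3.0, 3.0), (4.0, 4.0, 4.0))
```

Only pairs for which `aabb_overlap` returns True would need to go on to a more expensive exact test.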
A PVT Tolerant Sub-mA PLL for High Speed Links (Cited by: 2)
2008
A sub-mA phase-locked loop fabricated in a 65 nm standard digital CMOS process is presented. The impact of process variation is largely removed by a novel open-loop calibration that is performed only during start-up but is opened during normal operation. This method reduces calibration time significantly compared with its closed-loop counterpart. The dual-loop PLL architecture is adopted to achieve a process-independent damping factor and pole-zero separation. A new phase frequency detector embedded with a level shifter is introduced. Careful power partitioning is explored to minimize the noise coupling. The proposed PLL achieves 3.1 ps RMS jitter running at 1.6 GHz while consuming only 0.94 mA.
杨祎, 杨丽琼, 张锋, 高茁, 黄令仪, 胡伟武
Keywords: PLL, JITTER
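Design and Implementation of the Loongson 3 (Godson-3) Interconnect System (Cited by: 23)
2008
The interconnect of Loongson 3 adopts a scalable, distributed multi-core structure based on a 2D mesh, providing a unified topology and logic design for chip-level, board-level, and system-level interconnection. The external interface of Loongson 3 uses an extended HyperTransport protocol, which can be used both to connect I/O and to interconnect multiple chips. The interconnect also includes a software-configurable routing mechanism, so that medium-scale CC-NUMA systems and larger-scale NCC-NUMA systems can be built directly at board level with efficient communication. The paper presents the multiprocessor interconnect architecture based on Loongson 3. It uses a two-level scalable interconnect: on chip, a 2D mesh connects multiple nodes; within a node, a crossbar connects multiple processor cores and L2 cache modules. Between chips, a 16-core multiprocessor system can be built through cache-coherent HyperTransport interfaces without extra hardware support. Using hierarchical directory techniques, Loongson 3 can also support larger-scale multiprocessor systems. The Loongson 3 interconnect architecture provides strong support for building simple, efficient, flexible, and highly scalable shared-memory multiprocessor systems.
王焕东, 高翔, 陈云霁, 胡伟武
Keywords: multi-core, architecture, interconnect, processor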
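To make the 2D mesh topology in the abstract above more concrete, the sketch below lists the nodes a message would visit between two mesh coordinates. It assumes dimension-order (X first, then Y) routing, which is a common textbook choice and is not stated in the abstract; the function name `xy_route` and the coordinate convention are hypothetical.

```python
from typing import List, Tuple

Node = Tuple[int, int]


def xy_route(src: Node, dst: Node) -> List[Node]:
    """List the mesh nodes visited from src to dst, moving along X first, then Y.

    Dimension-order routing is an assumption made for illustration only.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


route = xy_route((0, 0), (2, 1))
hops = len(route) - 1  # number of links traversed
```

Under this assumption the hop count equals the Manhattan distance between the two nodes.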
Making Effective Decisions in Computer Architects' Real-World: Lessons and Experiences with Godson-2 Processor Designs
2008
Although many kinds of microprocessors have been designed over several decades, the computer architecture R&D community lacks well-documented lessons and experiences about design decisions in the research literature. In this paper, we systematically present the design decisions we made during the design and prototyping of the Godson-2 series processors. The 250 MHz Godson-2B, 450 MHz Godson-2C, and 1 GHz Godson-2E processors, which implement a 64-bit, four-issue, out-of-order architecture, were taped out in 2003, 2004, and 2005, respectively. Each processor triples its predecessor's SPEC CPU2000 rates. The first-hand experiences and lessons gained from these designs provide perspectives and insights that are not available in existing textbooks or published papers. We summarize 10 critical lessons and experiences based on hundreds of our attempts at architectural and design optimizations for performance improvement of the Godson-2 series processors. The issues include silicon-simulation correlation, design balancing, performance optimization, and pico-architecture tuning. We conclude that persistent improvement, an attitude oriented towards work-on-silicon design, and an insightful understanding of software and the fabrication process are the three most important factors in designing a high-performance processor with low energy consumption.
胡伟武, 王剑
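A Cache Architecture Trading Off Latency and Capacity in Chip Multiprocessors (Cited by: 4)
2009
The design of the L2 cache in a chip multiprocessor (CMP) faces a conflict between latency and capacity, which cannot both be satisfied at once: a private organization has lower hit latency but reduces the effective cache capacity, while a shared organization increases the effective capacity but has longer hit latency. This paper proposes a cache architecture for CMPs that trades off latency and capacity (TCLC). It is a hybrid of the private and shared designs. The core idea is to dynamically identify the sharing type of each cache block and optimize each type differently: private blocks are migrated, shared read-only blocks are replicated, and shared read-write blocks are placed centrally. The goal is an access latency close to that of the private organization and an effective capacity close to that of the shared organization, thereby mitigating the effect of wire delay and reducing the average memory access latency. Full-system simulation results show that TCLC improves performance by 13.7% on average over the private organization and by 12% on average over the shared organization.
肖俊华, 冯子军, 章隆兵
Keywords: chip multiprocessor, L2 cache, migration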
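The per-type policy choice described in the TCLC abstract above can be sketched as a small classifier. This is a hedged illustration only: the abstract does not say how sharing types are detected in hardware, so the access-record format, the `SharingType` enum, and the `classify` and `choose_policy` functions are all hypothetical. Only the mapping from sharing type to policy comes from the abstract.

```python
from enum import Enum
from typing import Iterable, Tuple


class SharingType(Enum):
    PRIVATE = "private"
    SHARED_READ_ONLY = "shared read-only"
    SHARED_READ_WRITE = "shared read-write"


# Policy per sharing type, as listed in the abstract.
POLICY = {
    SharingType.PRIVATE: "migrate",
    SharingType.SHARED_READ_ONLY: "replicate",
    SharingType.SHARED_READ_WRITE: "place centrally",
}


def classify(accesses: Iterable[Tuple[int, bool]]) -> SharingType:
    """Classify one cache block from (core_id, is_write) records (hypothetical format)."""
    records = list(accesses)
    cores = {core for core, _ in records}
    written = any(is_write for _, is_write in records)
    if len(cores) <= 1:
        return SharingType.PRIVATE
    return SharingType.SHARED_READ_WRITE if written else SharingType.SHARED_READ_ONLY


def choose_policy(accesses: Iterable[Tuple[int, bool]]) -> str:
    """Return the optimization the abstract assigns to the block's sharing type."""
    return POLICY[classify(accesses)]
```

For example, a block read by core 0 and core 1 and written by neither would be classified as shared read-only and therefore replicated.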
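Optimization of a Binary Translator Based on the Loongson Processor (Cited by: 14)
2009
Binary translation is a principal method of system migration, but software-only binary translation on general-purpose platforms performs poorly. Using the Loongson 2F processor as the implementation platform, this paper presents a QEMU binary translator and optimizes it in two respects: the compilation environment and the translator itself. The latter mainly involves direct register mapping and improvements to multimedia instructions. Experimental results show that register-mapping optimization yields a speedup of 1.45, and that with multimedia optimization, multimedia programs reach 80% of the performance of native execution.
蔡嵩松, 刘奇, 王剑, 刘金刚
Keywords: register, stack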
Chip Multithreaded Consistency Model
2008
Multithreading is the development trend of high-performance processors. The memory consistency model is essential to the correctness, performance, and complexity of a multithreaded processor. This paper proposes a chip multithreaded consistency model suited to multithreaded processors. The restrictions that chip multithreaded consistency imposes on memory event ordering are presented and formalized. Using the idea of the critical cycle introduced by Wei-Wu Hu, we prove that the proposed chip multithreaded consistency model satisfies the criterion for correct execution under the sequential consistency model. The chip multithreaded consistency model offers a way to achieve higher performance than the sequential consistency model while ensuring software compatibility: the execution result on a multithreaded processor is the same as the execution result on a uniprocessor. An implementation strategy for the chip multithreaded consistency model in the Godson-2 SMT processor is also proposed. The Godson-2 SMT processor supports the chip multithreaded consistency model correctly through an exception scheme based on the sequential memory access queue of each thread.
李祖松, 郇丹丹, 胡伟武, 唐志敏
Keywords: GODSON-2, MULTITHREADING