University of Maryland
Memory-Systems Research: Bibliography

Bruce Jacob - http://www.ece.umd.edu/~blj/

Brought to you by The Memory-Systems Research Consortium

The Studies

An annotated bibliography of our work in memory systems:

2007	(book)	*Memory Systems: Cache, DRAM, Disk.* Bruce Jacob, Spencer W. Ng, and David T. Wang, with contributions by Samuel Rodriguez. ISBN 978-0-12-379751-3. Morgan Kaufmann Publishers, Fall 2007.
		This represents the culmination of much of our work. Currently sitting at about 1000 pages, densely set (~500 words per page), it is roughly half a million total words.
	HPCA	"Fully-Buffered DIMM memory architectures: Understanding mechanisms, overheads and scaling." Brinda Ganesh, Aamer Jaleel, David Wang, and Bruce Jacob. Proc. 13th International Symposium on High Performance Computer Architecture (HPCA 2007). Phoenix AZ, February 2007.
		This is the first published peer-reviewed study of the performance of FB-DIMM. Up until this point, the only study publicly available anywhere was a master's thesis (also from our group) by Rami Nasr.
2006	ISLPED	"Energy/power breakdown of pipelined nanometer caches (90nm/65nm/45nm/32nm)." Samuel V. Rodriguez and Bruce Jacob. Proc. International Symposium on Low Power Electronics and Design (ISLPED 2006), pp. 25-30. Tegernsee Germany, October 2006.
		This is the most accurate study of SRAM energy & power yet; Sam's software, vCACTI, is a major overhaul of existing CACTI-based programs. Read his thesis.
	HPCA	"Last-level cache (LLC) performance of data-mining workloads on a CMP--A case study of parallel bioinformatics workloads." Aamer Jaleel, Matthew Mattina, and Bruce Jacob. Proc. 12th International Symposium on High Performance Computer Architecture (HPCA 2006), pp. 88-98. Austin TX, February 2006.
		Few besides Aamer are really looking at what sort of sharing patterns exist in the last-level cache.
	IEEE-TC	"In-line interrupt handling and lock-up free translation lookaside buffers (TLBs)." Aamer Jaleel and Bruce Jacob. IEEE Transactions on Computers, vol. 55, no. 5, pp. 559-574. May 2006.
		One of the problems with software-managed TLBs is that they can clog up a high-performance, deeply pipelined, highly out-of-order architecture. TLB misses happen regularly, and to service them in software, you have to flush the pipe (oops!) ... one of the reasons for the popularity of hardware-managed TLBs these days. This is a neat trick that gives you the best of both worlds.
2005	HPCA	"Using Virtual Load/Store Queues (VLSQs) to reduce the negative effects of reordered memory instructions." Aamer Jaleel and Bruce Jacob. Proc. 11th International Symposium on High Performance Computer Architecture (HPCA 2005), pp. 191-200. San Francisco CA, February 2005.
		The funny thing about out-of-order execution is that it is great for general instructions, but it is sucky for memory instructions. Norm Jouppi was one of the first to notice this, and Aamer was his intern at the time ... Norm handed the idea off to Aamer, and the study was born. Took us several years to get people to believe the results.
	ISPASS	"BioBench: A benchmark suite of bioinformatics applications." K. Albayraktaroglu, A. Jaleel, X. Wu, M. Franklin, B. Jacob, C.-W. Tseng, and D. Yeung. Proc. 2005 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2005), pp. 2-9. Austin TX, March 2005.
		A really cool set of benchmarks that pounds the memory system.
	SIGARCH	"DRAMsim: A memory-system simulator." David Wang, Brinda Ganesh, Nuengwong Tuaycharoen, Katie Baynes, Aamer Jaleel, and Bruce Jacob. SIGARCH Computer Architecture News, vol. 33, no. 4, pp. 100-107. September 2005.
		Simply put, the most accurate DRAM system simulator in the world. At least, that we know of.
2003	IEEE Micro	"A case for studying DRAM issues at the system level." Bruce Jacob. IEEE Micro, vol. 23, no. 4, pp. 44-56. July/August 2003.
		DRAM systems have become so complex that they resemble highly out-of-order processors. Lots of concurrency, lots of queueing and scheduling, deep pipelines ... it has gotten to the point that you really have to take an architectural approach, not just a circuits approach.
2001	ISCA	"Concurrency, latency, or system overhead: Which has the largest impact on uniprocessor DRAM-system performance?" Vinodh Cuppu and Bruce Jacob. Proc. 28th International Symposium on Computer Architecture (ISCA 2001), pp. 62-71. Goteborg Sweden, June 2001.
		The first study to show how complicated the memory-system design space is: it is extremely non-linear (very noisy, not well behaved -- good solutions lie right next to bad solutions), and the cost of poor analysis is huge -- the variance can be 3x or more from worst case to best case, even within a group of "high-performance" configurations.
	CASES	"Transparent data-memory organizations for digital signal processors." Sadagopan Srinivasan, Vinodh Cuppu, and Bruce Jacob. Proc. International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES 2001), pp. 44-48. Atlanta GA, November 2001.
		Explores an interesting design space wherein bandwidth is spent to reduce the storage needed to get good performance in DSP applications. Among other things, shows that you can get excellent performance out of a very small number of cache blocks.
	IEEE-TC	"High performance DRAMs in workstation environments." Vinodh Cuppu, Bruce Jacob, Brian Davis, and Trevor Mudge. IEEE Transactions on Computers, vol. 50, no. 11, pp. 1133-1153. November 2001. (TC Special Issue on High-Performance Memory Systems)
		This is our extended/journal version of the 1999 ISCA study. Among other things, this adds DDR measurements, cache effects (number of MSHRs), etc.
	IEEE-TC	"Uniprocessor virtual memory without TLBs." Bruce Jacob and Trevor Mudge. IEEE Transactions on Computers, vol. 50, no. 5, pp. 482-499. May 2001.
		This is our extended/journal version of the 1997 HPCA study. The original title when we submitted this to IEEE was "Software-managed address translation and software-managed caches," because we give a detailed description of how to build a software-managed cache. However, one reviewer was obstinate about not allowing us to use the term "software-managed cache" in the title, and also refused to accept the paper until after 2000 (we submitted the article in 1998, a full three years before it finally showed up in print). Interesting.
1999	ISCA	"A performance comparison of contemporary DRAM architectures." Vinodh Cuppu, Bruce Jacob, Brian Davis, and Trevor Mudge. Proc. 26th International Symposium on Computer Architecture (ISCA 1999), pp. 222-233. Atlanta GA, May 1999.
		This is the first-ever published study of DRAM-system performance. It is the seminal work on the topic. Cited by scads of researchers.
	CASES	"Hardware/software architectures for real-time caching." Bruce Jacob. Proc. Second Workshop on Compiler and Architecture Support for Embedded Systems (CASES 1999), pp. 135-138, Washington DC, October 1999.
		The paper presents more thought on the idea of software-managed caches, first mentioned in the 1998 ASPLOS paper, below, and also discussed in the 1998 CASES paper. In particular, this paper gives (and is the first to give) an architecture for a fully associative software-managed cache design.
	ESC	"Cache design for embedded real-time systems." Bruce Jacob. Embedded Systems Conference, Summer 1999. Danvers MA, June 1999.
		An extended abstract that goes into a little bit more detail on software-managed caches than the 1998 CASES paper below. The slides for the talk are available on-line in PDF format and include many details not found in the paper.
1998	ASPLOS	"A look at several memory management units, TLB-refill mechanisms, and page table organizations." Bruce Jacob and Trevor Mudge. Proc. Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 1998), pp. 295-306. San Jose CA, October 1998.
		This paper coins the term "software-managed cache" ... first-ever appearance of the now-common term. The paper is the first to show the low-level costs of virtual memory, for example the interrupt costs and the different costs associated with page-table organizations.
	IEEE Computer	"Virtual memory: Issues of implementation." Bruce Jacob and Trevor Mudge. IEEE Computer, vol. 31, no. 6, pp. 33-43. June 1998.
		I've been told that this paper, coupled with the IEEE Micro article below, is considered by many to be the definitive reference on virtual memory, and the two papers are what many graduate students use to study for their quals on the topic.
	IEEE Micro	"Virtual memory in contemporary microprocessors." Bruce Jacob and Trevor Mudge. IEEE Micro, vol. 18, no. 4, pp. 60-75. July/August 1998.
		I've been told that this paper, coupled with the IEEE Computer article above, is considered by many to be the definitive reference on virtual memory, and the two papers are what many graduate students use to study for their quals on the topic.
	CASES	"Software-managed caches: Architectural support for real-time embedded systems." Bruce Jacob. In CASES 1998: Workshop on Compiler and Architecture Support for Embedded Systems. Washington DC, December 1998.
		An extended abstract that introduces the idea of software-managed caches. The slides for the talk are available on-line in PDF format and include many details not found in the paper.
1997	HPCA	"Software-managed address translation." Bruce Jacob and Trevor Mudge. Proc. Third International Symposium on High Performance Computer Architecture (HPCA 1997), pp. 156-167, San Antonio TX, February 1997.
		Starts with the "in-cache address translation" concept and takes it a step further: i.e., what if we dispense with page-table-walking hardware altogether? (SPUR dispensed with the TLB and used a hardware table walker to probe the cache.) Among other things, this simplifies the hardware design (potentially making it simpler to verify, etc.) and gives more flexibility to the operating system (potentially enabling real-time guarantees, etc.).
1996	IEEE-TC	"An analytical model for designing memory hierarchies." Bruce Jacob, Peter Chen, Seth Silverman, and Trevor Mudge. IEEE Transactions on Computers, vol. 45, no. 10, pp. 1180-1194. October 1996.
		Yet another analytical cache-modeling paper. The difference: this one is correct. :)

Interested? Talk to us.

Contact Information

Prof. Bruce Jacob - http://www.ece.umd.edu/~blj/

Traditional correspondence can be sent to

Prof. Bruce Jacob
Dept. of Electrical & Computer Engineering
University of Maryland
College Park, MD 20742

Last updated: recently by Bruce Jacob using the vi text editor ... best viewed in Safari