mdeininger Veteran
Joined: 15 Jun 2005 Posts: 1740 Location: Emerald Isles, observing Dublin's docklands
|
Posted: Tue Oct 25, 2005 9:19 am Post subject: Fun Stuff to try: using Video Memory as a Block Device |
|
|
I found this interesting piece while looking for some nvram docs:
http://hedera.linuxnews.pl/_news/2002/09/03/_long/1445.html
hope nobody posted that yet, couldn't find it with a search so i guess i'd post about it *winks*
In short that's a howto on how to use odd ranges of memory as a block device, for example you can use the unused video memory on your graphics card for temporary storage. For that you have to go about it as follows (if you read that link above, note that you don't need xfree):
- Compile the MTD stuff as modules (it should work if you compile it in, but then you'd have to reboot...)
in kernel 2.6 you need "Device Drivers"->"Memory Technology Devices" and "Device Drivers"->"Memory Technology Devices"->"Caching Block Device access to MTD devices" compiled in or as a module, and "Device Drivers"->"Memory Technology Devices"->"Self-contained MTD Device Drivers"->"Physical System RAM" as a module.
(the guide says to use the slram driver, but the phram driver seems to be more recent)
- Find an area of RAM to use
use cat /proc/pci (or lspci) to figure out where your video card stores its memory. nVidia cards seem to have two memory areas, use the one that reads "prefetchable" in /proc/pci or the one that has the right amount of graphics memory with lspci.
for example, the card in my server box has the /proc/pci entry
Code: |
Bus 1, device 0, function 0:
VGA compatible controller: nVidia Corporation NV18 [GeForce4 MX 440SE AGP 8x] (rev 162).
IRQ 11.
Master Capable. Latency=64. Min Gnt=5.Max Lat=1.
Non-prefetchable 32 bit memory at 0xe8000000 [0xe8ffffff].
Prefetchable 32 bit memory at 0xe4000000 [0xe7ffffff].
|
and i used 0xe4000000.
Calculate the right starting address for your memory device.
This is easy if you're using a headless box: just add a "safety mebibyte" to the base address and you should be fine:
Code: |
0xe4000000+0x00100000 (1MiB in hex) = 0xe4100000
|
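The hex sum above is easy to get wrong by hand, so here is a minimal shell sketch that does the same addition. `vram_start` is a made-up helper name, not part of any tool, and it assumes a shell with 64-bit `$(( ))` arithmetic (bash or dash on a 64-bit system).

```shell
# Hypothetical helper: add an offset given in MiB to a BAR base address.
vram_start() {
    base=$1         # BAR base address, e.g. 0xe4000000
    offset_mib=$2   # how many MiB to skip at the start of the region
    printf '0x%x\n' $(( $base + $offset_mib * 1024 * 1024 ))
}

vram_start 0xe4000000 1   # prints 0xe4100000
```

The second argument is the only thing that changes if more room has to be left at the start of the region.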
If you're using X11, you need to reserve some memory for that -- you can calculate that by doing some math: max x-resolution * max y-resolution * max bits-per-pixel / 8 gives the framebuffer size in bytes, plus some pages just to be sure. You will probably end up with one or two megabytes. Give it four MiB if you have enough VRAM and you're on the safe side:
Code: |
0xe4000000+0x00400000 (4MiB in hex) = 0xe4400000
|
you also need to restrict XOrg to this amount of memory by adding a VideoRam entry to your graphics card's device definition in /etc/X11/xorg.conf.
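The framebuffer estimate from the X11 step can be worked through in the shell. This is only a sketch: `fb_reserve_mib` is a hypothetical helper, it assumes a single framebuffer at the maximum resolution, and the spare amount is passed in whole MiB rather than pages.

```shell
# Hypothetical helper: MiB to keep free for X11 at the start of the VRAM.
fb_reserve_mib() {
    x=$1; y=$2; bpp=$3; spare_mib=$4
    bytes=$(( x * y * bpp / 8 ))              # bits per pixel -> bytes
    mib=$(( (bytes + 1048575) / 1048576 ))    # round up to whole MiB
    echo $(( mib + spare_mib ))
}

fb_reserve_mib 1024 768 16 1   # prints 3
```

1024x768 at 16 bits per pixel is 1.5 MiB, which rounds up to 2, plus one spare MiB.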
load the bugger and see if it worked
the syntax for loading the module is like this:
Code: |
# modprobe phram phram=<name>,<start>,<length>
|
so i for one used:
Code: |
# modprobe phram phram=vram,0xe4100000,63Mi
|
you can see if it worked by doing a cat /proc/mtd. a good result looks like:
Code: |
chronos linux # cat /proc/mtd
dev: size erasesize name
mtd0: 03e00000 00001000 "vram"
|
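The size column in /proc/mtd is in hex bytes, which is awkward to read. A small sketch that converts it to MiB; `mtd_sizes` is a hypothetical helper that reads /proc/mtd-style text on stdin, and the sample fed to it here is the listing above.

```shell
# Hypothetical helper: print device, name and size in MiB for each MTD line.
mtd_sizes() {
    while read -r dev size erasesize name; do
        # Skip the header and anything else that is not an mtdN: line.
        case $dev in mtd*) ;; *) continue ;; esac
        echo "$dev $name $(( 0x$size / 1048576 )) MiB"
    done
}

mtd_sizes <<'EOF'
dev:    size   erasesize  name
mtd0: 03e00000 00001000 "vram"
EOF
# prints: mtd0: "vram" 62 MiB
```

On a live system the same function can be run as `mtd_sizes < /proc/mtd`.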
If it worked, load mtdblock
Code: |
# modprobe mtdblock
|
Now you should be able to use /dev/mtdblock0 like any other regular block device
Put a filesystem on it or use it as swap, whatever you feel like.
Grin, be happy and think that's too g33k1sh to be of any apparent use
------
jeez, linux does have some odd things you can do with it doesn't it?
come to think of it, that could actually come in handy as swap space... |
|
|
|
rojaro l33t
Joined: 06 May 2002 Posts: 732
|
Posted: Tue Oct 25, 2005 9:44 am Post subject: |
|
|
Hi,
that's somewhat interesting ... i've got a small file server running which is equipped with an older GF2 card which has 64MB of RAM onboard. Since i almost never have a monitor attached to it, this piece of memory could be used for something else. But i can't think of anything useful to do with that "won" memory.
- rojaro - _________________ A mathematician is a machine for turning coffee into theorems. ~ Alfred Renyi (*1921 - †1970) |
|
|
|
mdeininger Veteran
Joined: 15 Jun 2005 Posts: 1740 Location: Emerald Isles, observing Dublin's docklands
|
Posted: Tue Oct 25, 2005 11:04 am Post subject: |
|
|
well, it's the same for me, i've no clue what to use that for on my fileserver box... it could be handy for my router -- the os on that thing is 8 megs and it's running off a 64 meg compactflash; i could squeeze that into the gfx ram on boot, so that the frequently written-to logfiles don't kill the flashcard, or make a cow filesystem with the flash memory ro and everything written going to the gfx ram. but the logs are already in main ram i think -- plus the router is running freebsd anyway so that's not gonna help there
*shrugs* i made an fs on that and put my logs on it -- that way i got a ramdisk that doesn't eat "real" ram but still acts like one.
i *think* the driver was originally intended to be used for any excess memory that your motherboard couldn't handle. for example, some old boards where the bios couldn't handle more than 64 megs of ram could still hold dimms with more; you're supposedly able to point that driver at any ram above 64 megs and use that excess memory as swap space, thus gaining your full ram. dirty hack but supposedly works.
it's interesting to see it works on gfx memory tho |
|
|
|
arach Tux's lil' helper
Joined: 22 Jan 2005 Posts: 92 Location: Between the Moon and a star
|
Posted: Fri Oct 28, 2005 12:24 am Post subject: |
|
|
I wonder how fast this VRAM is, if it's not slow as hell it could be used as a swap device _________________ ble! |
|
|
|
artworcs Tux's lil' helper
Joined: 12 Jun 2005 Posts: 126
|
Posted: Fri Oct 28, 2005 9:47 pm Post subject: |
|
|
I've followed the instructions and managed to create a 48MB ext2 partition in the video memory. But it seems a little slow. hdparm gives about 5MB/s and file copy is ~ 2MB/s. |
|
|
|
NewBlackDak Guru
Joined: 02 Nov 2003 Posts: 512 Location: Utah County, UT
|
Posted: Sat Oct 29, 2005 7:36 am Post subject: |
|
|
I wonder if faster video ram would make faster mem?
Anyone tried it with PCI-E cards? _________________ Gentoo systems.
X2 4200+@2.6 - Athy
X2 3600+ - Myth
UltraSparc5 440 - sparcy |
|
|
|
nephros Advocate
Joined: 07 Feb 2003 Posts: 2139 Location: Graz, Austria (Europe - no kangaroos.)
|
Posted: Sat Oct 29, 2005 10:10 am Post subject: |
|
|
Sick -- sick! -- bastards.
Now if I read this correctly a little typo in one of the modprobe lines could have you overwriting your BIOS/CMOS nvram right? _________________ Please put [SOLVED] in your topic if you are a moron. |
|
|
|
mdeininger Veteran
Joined: 15 Jun 2005 Posts: 1740 Location: Emerald Isles, observing Dublin's docklands
|
Posted: Mon Oct 31, 2005 3:21 pm Post subject: |
|
|
lol, i dunno if it could do that actually -- that would probably only work if linux maps the bios to somewhere since it's running in protected mode? i thought you could only get access to the bios flash area in real mode on x86? now the cmos bit, that might work ...
about that thing with the vram being slow... i think that could be a problem with the graphics card. i only get about 5megs/sec myself but i thought that was due to my graphics card being slow, old and in an agp 1x slot.
still i agree, 5megs/sec is rather slow and makes the thing useless -- as NewBlackDak said, did anyone try this with a pci-e card? _________________ "Confident, lazy, cocky, dead." -- Felix Jongleur, Otherland
( Twitter | Blog | GitHub ) |
|
|
|
artworcs Tux's lil' helper
Joined: 12 Jun 2005 Posts: 126
|
Posted: Fri Dec 16, 2005 8:13 pm Post subject: |
|
|
I know that it's an old post, but i recently found out that the limitation is in the AGP. AGP has a bandwidth of 2.1GB/s processor->videocard, but only 0.2GB/s in the opposite direction. The reason for this limitation is simple: AGP was designed for games, and it was never imagined that the information should be sent anywhere else but to the screen. This limitation does not exist with a PCI Express card as it has 4GB/s bandwidth both ways.
offtopic: one can write shader programs to take advantage of the processing power of modern graphics cards. As a demonstration of the processing power of modern graphics cards, here is a small benchmark:
Sorting an array of 18000000 elements using QuickSort takes ~17 sec on a P4 3.4GHz but only ~2 sec on a GeForce 6800 Ultra.
More information can be found on: http://gpgpu.org/ |
|
|
|
mdeininger Veteran
Joined: 15 Jun 2005 Posts: 1740 Location: Emerald Isles, observing Dublin's docklands
|
Posted: Fri Jan 13, 2006 10:49 am Post subject: |
|
|
ah, that explains a lot! actually, that even explains why framebuffer effects are so very much slower on an x86 compared to -- say -- an n64.
thanks very much for that information *smile* _________________ "Confident, lazy, cocky, dead." -- Felix Jongleur, Otherland
( Twitter | Blog | GitHub ) |
|
|
|
pilo Tux's lil' helper
Joined: 31 Jan 2003 Posts: 90 Location: Sweden
|
Posted: Sat Jan 14, 2006 10:00 pm Post subject: |
|
|
Code: |
hdparm -tT /dev/mtdblock0
/dev/mtdblock0:
Timing cached reads: 2724 MB in 2.00 seconds = 1361.91 MB/sec
HDIO_DRIVE_CMD(null) (wait for flush complete) failed: Inappropriate ioctl for device
Timing buffered disk reads: 12 MB in 3.43 seconds = 3.50 MB/sec
HDIO_DRIVE_CMD(null) (wait for flush complete) failed: Inappropriate ioctl for device
|
time cp gives about 30-35MiB/s
This on a 6600GT PCI-E. _________________ "A stroll through a lunatic asylum shows that faith does not prove anything." |
|
|
|
tcx n00b
Joined: 23 Mar 2005 Posts: 32 Location: Porto, Portugal
|
Posted: Wed Mar 07, 2007 3:49 pm Post subject: |
|
|
Well, I used it on my headless mac mini G4 for a swap device.
I wrote an init script that did a mkswap /dev/mtdblock0 and swapon -p 1 /dev/mtdblock0 after loading the modules.
This way this swap had to be filled in order for the hdd swap to start being used.
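The init script itself isn't shown, so the following is only a guess at its shape, not the actual script. The `run` wrapper, the `DRY_RUN` switch, the function name and the example arguments (taken from the first post, not from the mac mini) are all assumptions, and it assumes the VRAM ends up as the first MTD device, /dev/mtdblock0.

```shell
# Hypothetical reconstruction; names, DRY_RUN and arguments are assumptions.
run() {
    # With DRY_RUN set, only print the command instead of executing it.
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

vram_swap_on() {
    name=$1; start=$2; length=$3
    run modprobe phram "phram=$name,$start,$length" &&
    run modprobe mtdblock &&
    run mkswap /dev/mtdblock0 &&
    run swapon -p 1 /dev/mtdblock0   # priority 1, so it is used before default-priority swap
}

# Preview the commands without touching the system:
( DRY_RUN=1; vram_swap_on vram 0xe4100000 63Mi )
```

Swap areas activated without an explicit priority get negative priorities, which is why priority 1 is enough to make the kernel use the VRAM first.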
Better to use this "free space" as a swap area than not to use it at all.
The video card is an ATI with 64MB ram. |
|
|
|
suicidal_orange_II Apprentice
Joined: 04 Sep 2004 Posts: 299
|
Posted: Sat Jul 14, 2007 5:45 pm Post subject: |
|
|
I was just playing about with this, but it doesn't seem to work. I get a device of the specified size that I can partition, but on writing (in fdisk) it complains and can't re-read the partition table.
I disabled X just to test this, so I followed the example to the letter.
I'm using a geforce 8800gts, which shows as having
lspci -v: |
01:00.0 VGA compatible controller: nVidia Corporation G80 [GeForce 8800 GTS] (rev a2) (prog-if 00 [VGA])
Subsystem: nVidia Corporation Unknown device 0420
Flags: bus master, fast devsel, latency 0, IRQ 16
Memory at e2000000 (32-bit, non-prefetchable) [size=16M]
Memory at d0000000 (64-bit, prefetchable) [size=256M]
Memory at e0000000 (64-bit, non-prefetchable) [size=32M]
I/O ports at 3000 [size=128]
Capabilities: [60] Power Management version 2
Capabilities: [68] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
Capabilities: [78] Express Endpoint IRQ 0
|
Anyone know if it showing the wrong memory size as prefetchable is a problem? it's a 320MB card.
If anyone can direct me to some "further reading" I'd happily read it, but all I find are patches and changelogs, and nothing when adding 8800gts
Thanks in advance,
Suicidal_Orange |
|
|
|
tr3 n00b
Joined: 06 Aug 2007 Posts: 4 Location: Italy
|
Posted: Mon Aug 06, 2007 10:40 pm Post subject: |
|
|
suicidal_orange_II wrote: | I was just playing about with this, but it doesn't seem to work. I get a device of the specified size that I can partition, but on writing (in fdisk) it complains and can't re-read the partition table. [...] |
hi, i was trying this too (with a geforce 6200) but it doesn't work for me either. i created an ext2 filesystem on the mtd device (mtdblock0), but when i try to write to it i can only write 24 MB, then i get an i/o error (if X is running when i get that error i get a lot of glitches on the monitor)
btw, i'm still trying it, if i find a solution i'll tell you.
ps: the nv driver doesn't support the VideoRam option (xorg.conf), i had to change a couple of lines in the sources to get X to use only 40M of VRAM |
|
|
|
jpsollie Guru
Joined: 17 Aug 2013 Posts: 323
|
Posted: Mon Apr 03, 2023 8:20 am Post subject: |
|
|
I tried this today on a headless machine (ok, screen is simply used for recovery, not day-to-day operation):
Code: |
4:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 23 [Radeon RX 6600/6600 XT/6600M] (rev c1) (prog-if 00 [VGA controller])
Subsystem: ASRock Incorporation Navi 23 [Radeon RX 6600/6600 XT/6600M]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 176
IOMMU group: 35
Region 0: Memory at 48000000000 (64-bit, prefetchable) [size=8G]
Region 2: Memory at 47f00000000 (64-bit, prefetchable) [size=2M]
Region 4: I/O ports at 4000 [size=256]
Region 5: Memory at 82300000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at 82400000 [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [64] Express (v2) Legacy Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 256 bytes
DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 16GT/s, Width x16
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt+ EETLPPrefix+, MaxEETLPPrefixes 1
EmergencyPowerReduction Form Factor Dev Specific, EmergencyPowerReductionInit-
FRS-
AtomicOpsCap: 32bit+ 64bit+ 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
AtomicOpsCtl: ReqEn+
LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee00000 Data: 0000
Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [200 v1] Physical Resizable BAR
BAR 0: current size: 8GB, supported: 256MB 512MB 1GB 2GB 4GB 8GB
BAR 2: current size: 2MB, supported: 2MB 4MB 8MB 16MB 32MB 64MB 128MB 256MB
Capabilities: [240 v1] Power Budgeting <?>
Capabilities: [270 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Capabilities: [2a0 v1] Access Control Services
ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
Capabilities: [2d0 v1] Process Address Space ID (PASID)
PASIDCap: Exec+ Priv+, Max PASID Width: 10
PASIDCtl: Enable- Exec- Priv-
Capabilities: [320 v1] Latency Tolerance Reporting
Max snoop latency: 0ns
Max no snoop latency: 0ns
Capabilities: [410 v1] Physical Layer 16.0 GT/s <?>
Capabilities: [440 v1] Lane Margining at the Receiver <?>
Kernel driver in use: amdgpu
Kernel modules: amdgpu
|
so I loaded phram with:
Code: |
modprobe phram phram=VRAM,0x48001000000,0x80000000
|
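Since a wrong start or length makes phram map whatever else lives at that address, it may be worth checking that start + length stays inside the region lspci reports before loading the module. A minimal sketch, assuming 64-bit shell arithmetic; `region_fits` is a hypothetical helper, and 0x200000000 is the 8G size of Region 0 written out in bytes.

```shell
# Hypothetical helper: succeed only if [start, start+length) lies inside the BAR.
region_fits() {
    bar_base=$1; bar_size=$2; start=$3; length=$4
    [ $(( $start )) -ge $(( $bar_base )) ] &&
    [ $(( $start + $length )) -le $(( $bar_base + $bar_size )) ]
}

# Region 0 from the lspci output above, with the modprobe arguments used here:
region_fits 0x48000000000 0x200000000 0x48001000000 0x80000000 && echo fits
# prints "fits"
```

This only checks the arithmetic; it says nothing about whether the driver that owns the card is using the same memory.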
checked /proc/mtd:
Code: |
cat /proc/mtd
dev: size erasesize name
mtd0: 180000000 00001000 "VRAM"
mtd1: 80000000 00001000 "VRAM"
|
... but the results were quite disappointing:
Code: |
dd if=/dev/mtdblock1 of=/dev/null bs=65536 count=1024
1024+0 records in
1024+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 82.5256 s, 813 kB/s
|
whereas the kblockd subsystem is eating CPU, all it actually does is a memcpy, according to "perf record":
Code: |
99.71% kworker/22:1H+k [kernel.vmlinux] [k] __memcpy
0.09% swapper [kernel.vmlinux] [k] acpi_idle_enter
0.03% swapper [kernel.vmlinux] [k] load_balance
0.01% swapper [kernel.vmlinux] [k] do_idle
|
so I'm obviously wondering: is this method still working properly? or pretty outdated and should maybe have a warning added to the subject? _________________ The power of Gentoo optimization (not overclocked): [img]https://www.passmark.com/baselines/V10/images/503714802842.png[/img] |