Gentoo Forums
Fun Stuff to try: using Video Memory as a Block Device
mdeininger
Veteran

Joined: 15 Jun 2005
Posts: 1740
Location: Emerald Isles, observing Dublin's docklands

Posted: Tue Oct 25, 2005 9:19 am    Post subject: Fun Stuff to try: using Video Memory as a Block Device

I found this interesting piece while looking for some nvram docs:
http://hedera.linuxnews.pl/_news/2002/09/03/_long/1445.html

hope nobody posted that yet, couldn't find it with a search so i guess i'd post about it *winks*

In short, that's a howto on using odd ranges of memory as a block device; for example, you can use the unused video memory on your graphics card for temporary storage. To do that, go about it as follows (if you read the link above, note that you don't need xfree):


  • Compile the MTD stuff as modules (it should work if you compile it in, but then you'd have to reboot...)
    In kernel 2.6 you need "Device Drivers"->"Memory Technology Devices" and "Device Drivers"->"Memory Technology Devices"->"Caching Block Device access to MTD devices", either compiled in or as modules, plus "Device Drivers"->"Memory Technology Devices"->"Self-contained MTD Device Drivers"->"Physical System RAM" as a module.
    (the guide says to use the slram driver, but the phram driver seems to be more recent)
  • Find an area of RAM to use
    Use cat /proc/pci (or lspci -v) to figure out where your video card's memory is mapped. nVidia cards seem to have two memory areas; use the one that reads "prefetchable" in /proc/pci, or the one whose size matches your amount of graphics memory in lspci -v.

    for example, the card in my server box has the /proc/pci entry
    Code:

      Bus  1, device   0, function  0:
        VGA compatible controller: nVidia Corporation NV18 [GeForce4 MX 440SE AGP 8x] (rev 162).
          IRQ 11.
          Master Capable.  Latency=64.  Min Gnt=5.Max Lat=1.
          Non-prefetchable 32 bit memory at 0xe8000000 [0xe8ffffff].
          Prefetchable 32 bit memory at 0xe4000000 [0xe7ffffff].

    and i used 0xe4000000.
  • Calculate the right starting address for your memory device.
    This is easy if you're using a headless box: just add a "Safety Mebibyte" to the base address and you should be fine:
    Code:

     0xe4000000+0x00100000 (1MiB in hex) = 0xe4100000

    If you're using X11, you need to reserve some memory for it. You can work out how much with some math: max x-resolution * max y-resolution * max bits-per-pixel / 8, plus some pages just to be sure. At 1024x768 and 16 bpp that comes to 1.5 MiB, at 32 bpp to 3 MiB. Give it four MiB if you have enough VRAM and you're on the safe side ;)
    Code:

     0xe4000000+0x00400000 (4MiB in hex) = 0xe4400000

    you also need to restrict XOrg to this amount of memory by adding
    Code:

     VideoRam 4096

    (the value is in kilobytes) to your graphics card's Device section in /etc/X11/xorg.conf.
  • load the bugger and see if it worked :D
    the syntax for loading the module is like this:
    Code:

     # modprobe phram phram=<name>,<start>,<length>

    so i for one used:
    Code:

     # modprobe phram phram=vram,0xe4100000,63Mi


    you can see if it worked by doing a cat /proc/mtd. a good result looks like:
    Code:

    chronos linux # cat /proc/mtd
    dev:    size   erasesize  name
    mtd0: 03e00000 00001000 "vram"


  • If it worked, load mtdblock
    Code:

     # modprobe mtdblock


  • Now you should be able to use /dev/mtdblock0 like any other regular block device
    Put a filesystem on it or use it as swap, whatever you feel like.

  • Grin, be happy and think that's too g33k1sh to be of any apparent use :D
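
The address arithmetic from the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (framebuffer_bytes, reserve_mib, phram_param) are made up for this example, the X11 reserve is rounded up to whole MiB, and the length is written with the "Mi" suffix used in the modprobe line above.

```python
MIB = 1024 * 1024

def framebuffer_bytes(width, height, bits_per_pixel):
    """Bytes needed for one full-screen framebuffer."""
    return width * height * bits_per_pixel // 8

def reserve_mib(width, height, bits_per_pixel, safety_mib=1):
    """Framebuffer size rounded up to whole MiB, plus a safety margin."""
    fb = framebuffer_bytes(width, height, bits_per_pixel)
    return -(-fb // MIB) + safety_mib  # ceiling division, then add margin

def phram_param(name, base, region_bytes, reserved_mib):
    """Build the <name>,<start>,<length> string for phram=..."""
    start = base + reserved_mib * MIB
    length_mib = region_bytes // MIB - reserved_mib
    if length_mib <= 0:
        raise ValueError("nothing left after reserving memory")
    return f"{name},{start:#x},{length_mib}Mi"

# Headless box from the example: 64 MiB region at 0xe4000000, 1 MiB reserved.
print(phram_param("vram", 0xe4000000, 64 * MIB, 1))  # vram,0xe4100000,63Mi

# With X11 at 1024x768, 32 bpp: 3 MiB framebuffer + 1 MiB margin = 4 MiB.
print(reserve_mib(1024, 768, 32))                    # 4
print(phram_param("vram", 0xe4000000, 64 * MIB, 4))  # vram,0xe4400000,60Mi
```

The resulting string is what goes after phram= on the modprobe line; the base address and region size have to come from your own /proc/pci or lspci -v output.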


------
jeez, linux does have some odd things you can do with it doesn't it? :D
come to think of it, that could actually come in handy as swap space...
rojaro
l33t

Joined: 06 May 2002
Posts: 732

Posted: Tue Oct 25, 2005 9:44 am

Hi,

that's somewhat interesting ... i've got a small file server running which is equipped with an older GF2 card which has 64MB of RAM onboard. Since i almost never have a monitor attached to it, this piece of memory could be used for something else. But i can't think of anything useful to do with that "won" memory.

- rojaro -
_________________
A mathematician is a machine for turning coffee into theorems. ~ Alfred Renyi (*1921 - †1970)
mdeininger
Veteran

Joined: 15 Jun 2005
Posts: 1740
Location: Emerald Isles, observing Dublin's docklands

Posted: Tue Oct 25, 2005 11:04 am

well, it's the same for me, i've no clue what to use that for on my fileserver box... it could be handy for my router -- the os on that thing is 8 megs and it's running off a 64 meg compactflash card. i could squeeze that into the gfx ram on boot so that the frequently written-to logfiles don't kill the flash card, or make a cow (copy-on-write) filesystem with the flash memory read-only and everything written going to the gfx ram. but the logs are already in main ram i think -- plus the router is running freebsd anyway, so that's not gonna help there

*shrugs* i made an fs on that and put my logs on it -- that way i got a ramdisk that doesn't eat "real" ram but still acts like one.

i *think* the driver was originally intended for excess memory that your motherboard couldn't handle. for example, some old boards whose bios couldn't handle more than 64 megs of ram could still hold dimms with more; you're supposedly able to point the driver at any ram above 64 megs and use that excess memory as swap, thus gaining your full ram. dirty hack, but it supposedly works.

it's interesting to see it works on gfx memory tho :D
arach
Tux's lil' helper

Joined: 22 Jan 2005
Posts: 92
Location: Between the Moon and a star

Posted: Fri Oct 28, 2005 12:24 am

I wonder how fast this VRAM is, if it's not slow as hell it could be used as a swap device ;)
_________________
ble! :P
artworcs
Tux's lil' helper

Joined: 12 Jun 2005
Posts: 126

Posted: Fri Oct 28, 2005 9:47 pm

I've followed the instructions and managed to create a 48MB ext2 partition in the video memory. But it seems a little slow: hdparm gives about 5MB/s and a file copy is ~2MB/s.
NewBlackDak
Guru

Joined: 02 Nov 2003
Posts: 512
Location: Utah County, UT

Posted: Sat Oct 29, 2005 7:36 am

I wonder if faster video ram would make faster mem?

Anyone tried it with PCI-E cards?
_________________
Gentoo systems.
X2 4200+@2.6 - Athy
X2 3600+ - Myth
UltraSparc5 440 - sparcy
nephros
Advocate

Joined: 07 Feb 2003
Posts: 2139
Location: Graz, Austria (Europe - no kangaroos.)

Posted: Sat Oct 29, 2005 10:10 am

Sick -- sick! -- bastards. 8)

Now if I read this correctly a little typo in one of the modprobe lines could have you overwriting your BIOS/CMOS nvram right?
_________________
Please put [SOLVED] in your topic if you are a moron.
mdeininger
Veteran

Joined: 15 Jun 2005
Posts: 1740
Location: Emerald Isles, observing Dublin's docklands

Posted: Mon Oct 31, 2005 3:21 pm

lol, i dunno if it could do that actually -- that would probably only work if linux maps the bios to somewhere, since it's running in protected mode? i thought you could only get access to the bios flash area in real mode on x86? now the cmos bit, that might work ... :D

about that thing with the vram being slow... i think that could be a problem with the graphics card. i only get about 5 megs/sec myself, but i thought that was due to my graphics card being slow, old and in an agp 1x slot ;).
still, i agree, 5 megs/sec is rather slow and makes the thing useless -- as NewBlackDak asked, did anyone try this with a pci-e card? :)
_________________
"Confident, lazy, cocky, dead." -- Felix Jongleur, Otherland

artworcs
Tux's lil' helper

Joined: 12 Jun 2005
Posts: 126

Posted: Fri Dec 16, 2005 8:13 pm

I know that it's an old post, but i recently found out that the limitation is in AGP. AGP has a bandwidth of 2.1GB/s processor->videocard, but only 0.2GB/s in the opposite direction. The reason for this limitation is simple: AGP was designed for games, and it was never imagined that the information should be sent anywhere but to the screen. This limitation does not exist with a PCI Express card, as it has 4GB/s of bandwidth both ways.

offtopic: one can write shader programs to take advantage of the processing power of modern graphics cards. As a demonstration, here is a small benchmark:
Sorting an array of 18000000 elements using QuickSort takes ~17 sec on a P4 3.4GHz but only ~2 sec on a GeForce 6800 Ultra.
More information can be found at: http://gpgpu.org/
mdeininger
Veteran

Joined: 15 Jun 2005
Posts: 1740
Location: Emerald Isles, observing Dublin's docklands

Posted: Fri Jan 13, 2006 10:49 am

ah, that explains a lot! actually, that even explains why framebuffer effects are so very much slower on an x86 compared to -- say -- an n64.
thanks very much for that information *smile*
_________________
"Confident, lazy, cocky, dead." -- Felix Jongleur, Otherland

pilo
Tux's lil' helper

Joined: 31 Jan 2003
Posts: 90
Location: Sweden

Posted: Sat Jan 14, 2006 10:00 pm

Code:

hdparm -tT /dev/mtdblock0

/dev/mtdblock0:
 Timing cached reads:   2724 MB in  2.00 seconds = 1361.91 MB/sec
HDIO_DRIVE_CMD(null) (wait for flush complete) failed: Inappropriate ioctl for device
 Timing buffered disk reads:   12 MB in  3.43 seconds =   3.50 MB/sec
HDIO_DRIVE_CMD(null) (wait for flush complete) failed: Inappropriate ioctl for device


time cp gives about 30-35MiB/s

This is on a 6600GT PCI-E.
_________________
"A stroll through a lunatic asylum shows that faith does not prove anything."
tcx
n00b

Joined: 23 Mar 2005
Posts: 32
Location: Porto, Portugal

Posted: Wed Mar 07, 2007 3:49 pm

Well, I used it on my headless Mac mini G4 as a swap device.
I wrote an init script that did a mkswap /dev/mtdblock0 and swapon -p 1 /dev/mtdblock0 after loading the modules.
This way, this swap had to fill up before the hdd swap started being used.
Better to use this "free space" as a swap area than not to use it at all.
The video card is an ATI with 64MB of RAM.
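
A rough sketch of what such an init step could look like, written in Python purely for illustration. The function names (vram_swap_commands, run_all) and the phram parameter string are hypothetical; the commands themselves are the ones mentioned in this thread (modprobe, mkswap, swapon -p). By default it only prints the commands; actually running them needs root and overwrites whatever is in that memory region.

```python
import subprocess

def vram_swap_commands(phram_param, device="/dev/mtdblock0", priority=1):
    """Return the commands, in order, that set up VRAM-backed swap."""
    return [
        ["modprobe", "phram", f"phram={phram_param}"],
        ["modprobe", "mtdblock"],
        ["mkswap", device],
        ["swapon", "-p", str(priority), device],
    ]

def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)

if __name__ == "__main__":
    # Hypothetical address and size; substitute the values for your own card.
    run_all(vram_swap_commands("vram,0xe4100000,63Mi"))
```

swapon uses higher-priority areas first, and swap areas added without -p get negative priorities from the kernel, so a priority of 1 is enough to have the VRAM swap filled before a disk swap that uses the default.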
suicidal_orange_II
Apprentice

Joined: 04 Sep 2004
Posts: 299

Posted: Sat Jul 14, 2007 5:45 pm

I was just playing about with this, but it doesn't seem to work. I get a device of the specified size that I can partition, but on writing (in fdisk) it complains and can't re-read the partition table.

I disabled X just to test this, so followed the example to the letter :(

I'm using a geforce 8800gts, which shows as having

lspci -v:

01:00.0 VGA compatible controller: nVidia Corporation G80 [GeForce 8800 GTS] (rev a2) (prog-if 00 [VGA])
        Subsystem: nVidia Corporation Unknown device 0420
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at e2000000 (32-bit, non-prefetchable) [size=16M]
        Memory at d0000000 (64-bit, prefetchable) [size=256M]
        Memory at e0000000 (64-bit, non-prefetchable) [size=32M]
        I/O ports at 3000 [size=128]
        Capabilities: [60] Power Management version 2
        Capabilities: [68] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
        Capabilities: [78] Express Endpoint IRQ 0

Anyone know if it showing the wrong memory size as prefetchable is a problem? It's a 320MB card.

If anyone can direct me to some "further reading" I'd happily read it, but all I find are patches and changelogs, and nothing when adding 8800gts to the search.

Thanks in advance,

Suicidal_Orange
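
For picking the candidate region out of lspci -v output like the above, a small parser can help. This is a minimal Python sketch, assuming the "Memory at <addr> (<n>-bit, [non-]prefetchable) [size=<n><unit>]" line format shown in this post; the names parse_memory_regions, REGION_RE and SIZE_UNITS are made up for this example. Keep in mind that the size printed by lspci is the size of the PCI memory window (BAR), which is a power of two and does not have to match the amount of memory on the card.

```python
import re

SIZE_UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}

REGION_RE = re.compile(
    r"Memory at ([0-9a-f]+) \((\d+)-bit, (non-)?prefetchable\) "
    r"\[size=(\d+)([KMG])\]"
)

def parse_memory_regions(lspci_text):
    """Return one dict per 'Memory at ...' line found in lspci -v output."""
    regions = []
    for m in REGION_RE.finditer(lspci_text):
        addr, width, non, size, unit = m.groups()
        regions.append({
            "start": int(addr, 16),
            "width": int(width),
            "prefetchable": non is None,
            "size": int(size) * SIZE_UNITS[unit],
        })
    return regions

SAMPLE = """\
        Memory at e2000000 (32-bit, non-prefetchable) [size=16M]
        Memory at d0000000 (64-bit, prefetchable) [size=256M]
        Memory at e0000000 (64-bit, non-prefetchable) [size=32M]
"""

for r in parse_memory_regions(SAMPLE):
    if r["prefetchable"]:
        print(hex(r["start"]), r["size"] // (1 << 20), "MiB")
# prints: 0xd0000000 256 MiB
```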
tr3
n00b

Joined: 06 Aug 2007
Posts: 4
Location: Italy

Posted: Mon Aug 06, 2007 10:40 pm

suicidal_orange_II wrote:
I was just playing about with this, but it doesn't seem to work. I get a device of the specified size that I can partition, but on writing (in fdisk) it complains and can't re-read the partition table.

I disabled X just to test this, so followed the example to the letter :(

I'm using a geforce 8800gts, which shows as having

lspci -v:

01:00.0 VGA compatible controller: nVidia Corporation G80 [GeForce 8800 GTS] (rev a2) (prog-if 00 [VGA])
        Subsystem: nVidia Corporation Unknown device 0420
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at e2000000 (32-bit, non-prefetchable) [size=16M]
        Memory at d0000000 (64-bit, prefetchable) [size=256M]
        Memory at e0000000 (64-bit, non-prefetchable) [size=32M]
        I/O ports at 3000 [size=128]
        Capabilities: [60] Power Management version 2
        Capabilities: [68] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
        Capabilities: [78] Express Endpoint IRQ 0

Anyone know if it showing the wrong memory size as prefetchable is a problem? It's a 320MB card.

If anyone can direct me to some "further reading" I'd happily read it, but all I find are patches and changelogs, and nothing when adding 8800gts to the search.

Thanks in advance,

Suicidal_Orange


hi, i was trying this too (with a geforce 6200) but it doesn't work for me either. i created an ext2 filesystem on the mtd device (mtdblock0), but when i try to write to it i can only write 24MB, then i get an i/o error (if X is running when that happens, i get a lot of glitches on the monitor).
btw, i'm still trying; if i find a solution i'll tell you.

ps: the nv driver doesn't support the VideoRam option (xorg.conf), so i had to change a couple of lines in the sources to get X to use only 40M of VRAM
jpsollie
Guru

Joined: 17 Aug 2013
Posts: 323

Posted: Mon Apr 03, 2023 8:20 am

I tried this today on a headless machine (ok, the screen is only used for recovery, not day-to-day operation):
Code:

4:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 23 [Radeon RX 6600/6600 XT/6600M] (rev c1) (prog-if 00 [VGA controller])
        Subsystem: ASRock Incorporation Navi 23 [Radeon RX 6600/6600 XT/6600M]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 176
        IOMMU group: 35
        Region 0: Memory at 48000000000 (64-bit, prefetchable) [size=8G]
        Region 2: Memory at 47f00000000 (64-bit, prefetchable) [size=2M]
        Region 4: I/O ports at 4000 [size=256]
        Region 5: Memory at 82300000 (32-bit, non-prefetchable) [size=1M]
        Expansion ROM at 82400000 [disabled] [size=128K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [64] Express (v2) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 256 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 16GT/s, Width x16
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
                         10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt+ EETLPPrefix+, MaxEETLPPrefixes 1
                         EmergencyPowerReduction Form Factor Dev Specific, EmergencyPowerReductionInit-
                         FRS-
                         AtomicOpsCap: 32bit+ 64bit+ 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
                         AtomicOpsCtl: ReqEn+
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee00000  Data: 0000
        Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [200 v1] Physical Resizable BAR
                BAR 0: current size: 8GB, supported: 256MB 512MB 1GB 2GB 4GB 8GB
                BAR 2: current size: 2MB, supported: 2MB 4MB 8MB 16MB 32MB 64MB 128MB 256MB
        Capabilities: [240 v1] Power Budgeting <?>
        Capabilities: [270 v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Capabilities: [2a0 v1] Access Control Services
                ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
                ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        Capabilities: [2d0 v1] Process Address Space ID (PASID)
                PASIDCap: Exec+ Priv+, Max PASID Width: 10
                PASIDCtl: Enable- Exec- Priv-
        Capabilities: [320 v1] Latency Tolerance Reporting
                Max snoop latency: 0ns
                Max no snoop latency: 0ns
        Capabilities: [410 v1] Physical Layer 16.0 GT/s <?>
        Capabilities: [440 v1] Lane Margining at the Receiver <?>
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu


so I loaded phram with:

Code:

modprobe phram phram=VRAM,0x48001000000,0x80000000
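
As a sanity check on the numbers in that modprobe line, the range can be compared against Region 0 from the lspci output. A minimal Python sketch; the helper name fits_in_bar is made up for this example, and the values are copied from this post.

```python
def fits_in_bar(bar_start, bar_size, start, length):
    """True if [start, start + length) lies entirely inside the BAR."""
    return bar_start <= start and start + length <= bar_start + bar_size

MIB = 1 << 20
GIB = 1 << 30

bar0_start, bar0_size = 0x48000000000, 8 * GIB  # Region 0 from lspci
start, length = 0x48001000000, 0x80000000       # values from the modprobe line

print(fits_in_bar(bar0_start, bar0_size, start, length))  # True
print((start - bar0_start) // MIB, "MiB offset")          # 16 MiB offset
print(length // GIB, "GiB length")                        # 2 GiB length
```

So the device starts 16 MiB into the 8 GiB window and covers 2 GiB of it.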


checked /proc/mtd:
Code:

cat /proc/mtd
dev:    size   erasesize  name
mtd0: 180000000 00001000 "VRAM"
mtd1: 80000000 00001000 "VRAM"
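
The size and erasesize columns in /proc/mtd are hexadecimal byte counts, which are easy to misread. A small Python sketch that converts them, assuming the column layout shown above; the function name parse_proc_mtd is made up for this example.

```python
def parse_proc_mtd(text):
    """Parse /proc/mtd text into (device, size_bytes, erasesize_bytes, name) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("dev:"):
            continue  # skip blank lines and the header row
        dev, rest = line.split(":", 1)
        size_hex, erase_hex, name = rest.split(None, 2)
        entries.append((dev, int(size_hex, 16), int(erase_hex, 16), name.strip('"')))
    return entries

SAMPLE = '''dev:    size   erasesize  name
mtd0: 180000000 00001000 "VRAM"
mtd1: 80000000 00001000 "VRAM"
'''

for dev, size, erase, name in parse_proc_mtd(SAMPLE):
    print(dev, name, size // (1 << 20), "MiB")
# prints:
# mtd0 VRAM 6144 MiB
# mtd1 VRAM 2048 MiB
```

On a live system the input would be the contents of /proc/mtd instead of the SAMPLE string.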


... but the results were quite disappointing:

Code:

dd if=/dev/mtdblock1 of=/dev/null bs=65536 count=1024
1024+0 records in
1024+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 82.5256 s, 813 kB/s


although the kblockd subsystem is eating CPU, all it actually does is a memcpy, according to "perf record":

Code:

  99.71%  kworker/22:1H+k  [kernel.vmlinux]  [k] __memcpy
   0.09%  swapper          [kernel.vmlinux]  [k] acpi_idle_enter
   0.03%  swapper          [kernel.vmlinux]  [k] load_balance
   0.01%  swapper          [kernel.vmlinux]  [k] do_idle


so I'm obviously wondering: does this method still work properly, or is it pretty outdated and in need of a warning added to the subject?
_________________
The power of Gentoo optimization (not overclocked): [img]https://www.passmark.com/baselines/V10/images/503714802842.png[/img]