
13/3/11

Page Cache, the Affair Between Memory and Files

Previously we looked at how the kernel manages virtual memory for a user process, but files and I/O were left out. This post covers the important and often misunderstood relationship between files and memory and its consequences for performance.

Two serious problems must be solved by the OS when it comes to files. The first one is the mind-blowing slowness of hard drives, and disk seeks in particular, relative to memory. The second is the need to load file contents in physical memory once and share the contents among programs. If you use Process Explorer to poke at Windows processes, you’ll see there are ~15MB worth of common DLLs loaded in every process. My Windows box right now is running 100 processes, so without sharing I’d be using up to ~1.5 GB of physical RAM just for common DLLs. No good. Likewise, nearly all Linux programs need ld.so and libc, plus other common libraries.

Happily, both problems can be dealt with in one shot: the page cache, where the kernel stores page-sized chunks of files. To illustrate the page cache, I’ll conjure a Linux program named render, which opens file scene.dat and reads it 512 bytes at a time, storing the file contents into a heap-allocated block. The first read goes like this:
Read from Page Cache
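For reference, here is a minimal C sketch of what a program like render might look like; the 1MB buffer capacity is an assumption for illustration only, and error handling is trimmed:

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* Heap-allocated block that will hold the file contents
       (capacity is a made-up figure for this sketch). */
    size_t capacity = 1 << 20;
    char *heap_block = malloc(capacity);
    if (heap_block == NULL)
        return 1;

    int fd = open("scene.dat", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Read 512 bytes at a time; each read is served from the page cache,
       which loads whole 4KB chunks of the file on demand. */
    size_t total = 0;
    ssize_t n;
    while (total + 512 <= capacity &&
           (n = read(fd, heap_block + total, 512)) > 0)
        total += (size_t)n;

    printf("read %zu bytes of scene.dat\n", total);
    close(fd);
    free(heap_block);
    return 0;
}
```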
After 12KB have been read, render's heap and the relevant page frames look thus:
Non-Mapped File Read
This looks innocent enough, but there’s a lot going on. First, even though this program uses regular read calls, three 4KB page frames are now in the page cache storing part of scene.dat. People are sometimes surprised by this, but all regular file I/O happens through the page cache. In x86 Linux, the kernel thinks of a file as a sequence of 4KB chunks. If you read a single byte from a file, the whole 4KB chunk containing the byte you asked for is read from disk and placed into the page cache. This makes sense because sustained disk throughput is pretty good and programs normally read more than just a few bytes from a file region. The page cache knows the position of each 4KB chunk within the file, depicted above as #0, #1, etc. Windows uses 256KB views analogous to pages in the Linux page cache.

Sadly, in a regular file read the kernel must copy the contents of the page cache into a user buffer, which not only takes cpu time and hurts the cpu caches, but also wastes physical memory with duplicate data. As per the diagram above, the scene.dat contents are stored twice, and each instance of the program would store the contents an additional time. We’ve mitigated the disk latency problem but failed miserably at everything else. Memory-mapped files are the way out of this madness:
Mapped File Read
When you use file mapping, the kernel maps your program’s virtual pages directly onto the page cache. This can deliver a significant performance boost: Windows System Programming reports run time improvements of 30% and up relative to regular file reads, while similar figures are reported for Linux and Solaris in Advanced Programming in the Unix Environment. You might also save large amounts of physical memory, depending on the nature of your application.

As always with performance, measurement is everything, but memory mapping earns its keep in a programmer’s toolbox. The API is pretty nice too, it allows you to access a file as bytes in memory and does not require your soul and code readability in exchange for its benefits. Mind your address space and experiment with mmap in Unix-like systems, CreateFileMapping in Windows, or the many wrappers available in high level languages. When you map a file its contents are not brought into memory all at once, but rather on demand via page faults. The fault handler maps your virtual pages onto the page cache after obtaining a page frame with the needed file contents. This involves disk I/O if the contents weren’t cached to begin with.
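As a hedged illustration of the Unix path, here's a minimal sketch that maps scene.dat and touches one byte per page to force the contents in; the checksum is just a way to make the reads happen and is not part of the original example:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("scene.dat", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the file read-only; the virtual pages are wired to the page cache
       lazily, one page fault at a time, as the mapping is touched. */
    const char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    /* Touch one byte per 4KB page to bring the contents in on demand. */
    long pages = (st.st_size + 4095) / 4096;
    unsigned long checksum = 0;
    for (long i = 0; i < pages; i++)
        checksum += (unsigned char)data[i * 4096];

    printf("%ld pages mapped, checksum %lu\n", pages, checksum);
    munmap((void *)data, st.st_size);
    close(fd);
    return 0;
}
```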

Now for a pop quiz. Imagine that the last instance of our render program exits. Would the pages storing scene.dat in the page cache be freed immediately? People often think so, but that would be a bad idea. When you think about it, it is very common for us to create a file in one program, exit, then use the file in a second program. The page cache must handle that case. When you think more about it, why should the kernel ever get rid of page cache contents? Remember that disk is 5 orders of magnitude slower than RAM, hence a page cache hit is a huge win. So long as there’s enough free physical memory, the cache should be kept full. It is therefore not dependent on a particular process, but rather it’s a system-wide resource. If you run render a week from now and scene.dat is still cached, bonus! This is why the kernel cache size climbs steadily until it hits a ceiling. It’s not because the OS is garbage and hogs your RAM, it’s actually good behavior because in a way free physical memory is a waste. Better use as much of the stuff for caching as possible.
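You can watch this behaviour on any Linux box: /proc/meminfo reports the page cache size system-wide. A tiny sketch that simply prints the Cached and Dirty figures (the dirty pages come up again in the next paragraph):

```c
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Print the system-wide page cache figures the kernel reports. */
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL) {
        perror("/proc/meminfo");
        return 1;
    }
    char line[256];
    while (fgets(line, sizeof line, f))
        if (strncmp(line, "Cached:", 7) == 0 || strncmp(line, "Dirty:", 6) == 0)
            fputs(line, stdout);
    fclose(f);
    return 0;
}
```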

Due to the page cache architecture, when a program calls write() bytes are simply copied to the page cache and the page is marked dirty. Disk I/O normally does not happen immediately, thus your program doesn't block waiting for the disk. On the downside, if the computer crashes your writes will never make it, hence critical files like database transaction logs must be fsync()ed (though one must still worry about drive controller caches, oy!). Reads, on the other hand, normally block your program until the data is available. Kernels employ eager loading to mitigate this problem, an example of which is read ahead where the kernel preloads a few pages into the page cache in anticipation of your reads. You can help the kernel tune its eager loading behavior by providing hints on whether you plan to read a file sequentially or randomly (see madvise(), readahead(), Windows cache hints). Linux does read-ahead for memory-mapped files, but I'm not sure about Windows. Finally, it's possible to bypass the page cache using O_DIRECT in Linux or FILE_FLAG_NO_BUFFERING in Windows, something database software often does.
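A small sketch tying these pieces together, assuming Linux; the file names (txn.log, scene.dat) are made up for illustration, and posix_fadvise() is used here as one concrete way to pass a sequential-read hint, alongside the madvise()/readahead() calls mentioned above:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* write() only copies into the page cache and marks pages dirty... */
    int log = open("txn.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (log < 0) { perror("open txn.log"); return 1; }
    const char *record = "BEGIN;UPDATE accounts;COMMIT;\n";
    if (write(log, record, strlen(record)) < 0) { perror("write"); return 1; }
    /* ...so a critical writer must fsync() before trusting the disk. */
    if (fsync(log) < 0) { perror("fsync"); return 1; }
    close(log);

    /* Hint that a big file will be read sequentially, so the kernel can
       read ahead more aggressively. */
    int fd = open("scene.dat", O_RDONLY);
    if (fd >= 0) {
        posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
        /* ... stream through the file with read() here ... */
        close(fd);
    }
    return 0;
}
```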

A file mapping may be private or shared. This refers only to updates made to the contents in memory: in a private mapping the updates are not committed to disk or made visible to other processes, whereas in a shared mapping they are. Kernels use the copy on write mechanism, enabled by page table entries, to implement private mappings. In the example below, both render and another program called render3d (am I creative or what?) have mapped scene.dat privately. Render then writes to its virtual memory area that maps the file:
Copy On Write
The read-only page table entries shown above do not mean the mapping is read only, they’re merely a kernel trick to share physical memory until the last possible moment. You can see how ‘private’ is a bit of a misnomer until you remember it only applies to updates. A consequence of this design is that a virtual page that maps a file privately sees changes done to the file by other programs as long as the page has only been read from. Once copy-on-write is done, changes by others are no longer seen. This behavior is not guaranteed by the kernel, but it’s what you get in x86 and makes sense from an API perspective. By contrast, a shared mapping is simply mapped onto the page cache and that’s it. Updates are visible to other processes and end up in the disk. Finally, if the mapping above were read-only, page faults would trigger a segmentation fault instead of copy on write.
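Here's a hedged sketch of the two flavours side by side, reusing scene.dat from the running example; it assumes the file is writable and at least one page long, and a real program would map the whole file and check its size:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("scene.dat", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }
    size_t len = 4096;   /* just the first page, for brevity */

    /* Private mapping: the first write triggers copy-on-write, and the
       change never reaches the page cache or the file on disk. */
    char *priv = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

    /* Shared mapping: writes land directly in the page cache, are visible
       to other processes mapping the file, and are eventually written back. */
    char *shared = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    if (priv == MAP_FAILED || shared == MAP_FAILED) { perror("mmap"); return 1; }

    priv[0] = 'P';    /* copied page, discarded when the mapping goes away */
    shared[0] = 'S';  /* dirty page-cache page, flushed back to scene.dat */
    msync(shared, len, MS_SYNC);   /* force the write-back now */

    printf("private byte: %c, shared byte: %c\n", priv[0], shared[0]);
    munmap(priv, len);
    munmap(shared, len);
    close(fd);
    return 0;
}
```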

Dynamically loaded libraries are brought into your program’s address space via file mapping. There’s nothing magical about it, it’s the same private file mapping available to you via regular APIs. Below is an example showing part of the address spaces from two running instances of the file-mapping render program, along with physical memory, to tie together many of the concepts we’ve seen.
Virtual To Physical Mapping
This concludes our 3-part series on memory fundamentals. I hope the series was useful and provided you with a good mental model of these OS topics. Next week there's one more post on memory usage figures, and then it's time for a change of air. Maybe some Web 2.0 gossip or something.

How The Kernel Manages Your Memory

After examining the virtual address layout of a process, we turn to the kernel and its mechanisms for managing user memory. Here is gonzo again:
mm_struct
Linux processes are implemented in the kernel as instances of task_struct, the process descriptor. The mm field in task_struct points to the memory descriptor, mm_struct, which is an executive summary of a program’s memory. It stores the start and end of memory segments as shown above, the number of physical memory pages used by the process (rss stands for Resident Set Size), the amount of virtual address space used, and other tidbits. Within the memory descriptor we also find the two work horses for managing program memory: the set of virtual memory areas and the page tables. Gonzo’s memory areas are shown below:
Memory Descriptor and Memory Areas
Each virtual memory area (VMA) is a contiguous range of virtual addresses; these areas never overlap. An instance of vm_area_struct fully describes a memory area, including its start and end addresses, flags to determine access rights and behaviors, and the vm_file field to specify which file is being mapped by the area, if any. A VMA that does not map a file is anonymous. Each memory segment above (e.g., heap, stack) corresponds to a single VMA, with the exception of the memory mapping segment. This is not a requirement, though it is usual in x86 machines. VMAs do not care which segment they are in.

A program’s VMAs are stored in its memory descriptor both as a linked list in the mmap field, ordered by starting virtual address, and as a red-black tree rooted at the mm_rb field. The red-black tree allows the kernel to search quickly for the memory area covering a given virtual address. When you read file /proc/pid_of_process/maps, the kernel is simply going through the linked list of VMAs for the process and printing each one.
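You can look at a process's VMAs for yourself. Here's a minimal sketch that dumps its own /proc/self/maps; each line the kernel prints corresponds to one vm_area_struct, showing the address range, permissions, offset, and backing file, if any:

```c
#include <stdio.h>

int main(void)
{
    /* Each line corresponds to one VMA of this very process. */
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL) {
        perror("fopen");
        return 1;
    }
    char line[512];
    while (fgets(line, sizeof line, maps))
        fputs(line, stdout);
    fclose(maps);
    return 0;
}
```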

In Windows, the EPROCESS block is roughly a mix of task_struct and mm_struct. The Windows analog to a VMA is the Virtual Address Descriptor (VAD); they are stored in an AVL tree. You know what the funniest thing about Windows and Linux is? It’s the little differences.

The 4GB virtual address space is divided into pages. x86 processors in 32-bit mode support page sizes of 4KB, 2MB, and 4MB. Both Linux and Windows map the user portion of the virtual address space using 4KB pages. Bytes 0-4095 fall in page 0, bytes 4096-8191 fall in page 1, and so on. The size of a VMA must be a multiple of page size. Here’s 3GB of user space in 4KB pages:
Paged Virtual Space
The processor consults page tables to translate a virtual address into a physical memory address. Each process has its own set of page tables; whenever a process switch occurs, page tables for user space are switched as well. Linux stores a pointer to a process’ page tables in the pgd field of the memory descriptor. To each virtual page there corresponds one page table entry (PTE) in the page tables, which in regular x86 paging is a simple 4-byte record shown below:
x86 Page Table Entry 4KB
Linux has functions to read and set each flag in a PTE. Bit P tells the processor whether the virtual page is present in physical memory. If clear (equal to 0), accessing the page triggers a page fault. Keep in mind that when this bit is zero, the kernel can do whatever it pleases with the remaining fields. The R/W flag stands for read/write; if clear, the page is read-only. Flag U/S stands for user/supervisor; if clear, then the page can only be accessed by the kernel. These flags are used to implement the read-only memory and protected kernel space we saw before.

Bits D and A are for dirty and accessed. A dirty page has had a write, while an accessed page has had a write or read. Both flags are sticky: the processor only sets them, they must be cleared by the kernel. Finally, the PTE stores the starting physical address that corresponds to this page, aligned to 4KB. This naive-looking field is the source of some pain, for it limits addressable physical memory to 4 GB. The other PTE fields are for another day, as is Physical Address Extension.
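As an illustration only, here's a toy decoder for the flag bits just described; the bit positions follow the x86 architecture manuals, but the sample PTE values are made up and this is not kernel code:

```c
#include <stdint.h>
#include <stdio.h>

/* Low flag bits of a classic 32-bit x86 page table entry. */
#define PTE_PRESENT  (1u << 0)   /* P: page is in physical memory            */
#define PTE_RW       (1u << 1)   /* R/W: writable if set, read-only if clear */
#define PTE_USER     (1u << 2)   /* U/S: user-accessible if set              */
#define PTE_ACCESSED (1u << 5)   /* A: page has been read or written         */
#define PTE_DIRTY    (1u << 6)   /* D: page has been written                 */

/* Bits 12..31 hold the 4KB-aligned physical address of the page frame. */
#define PTE_FRAME(pte) ((uint32_t)(pte) & 0xFFFFF000u)

static void decode(uint32_t pte)
{
    printf("frame 0x%08x %s%s%s%s%s\n", (unsigned)PTE_FRAME(pte),
           (pte & PTE_PRESENT)  ? "present "  : "not-present ",
           (pte & PTE_RW)       ? "writable " : "read-only ",
           (pte & PTE_USER)     ? "user "     : "kernel-only ",
           (pte & PTE_ACCESSED) ? "accessed " : "",
           (pte & PTE_DIRTY)    ? "dirty"     : "");
}

int main(void)
{
    decode(0x08048067);   /* made-up PTE: present, writable, user, A and D set */
    decode(0x08049025);   /* made-up PTE: present, read-only, user, A set      */
    return 0;
}
```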

A virtual page is the unit of memory protection because all of its bytes share the U/S and R/W flags. However, the same physical memory could be mapped by different pages, possibly with different protection flags. Notice that execute permissions are nowhere to be seen in the PTE. This is why classic x86 paging allows code on the stack to be executed, making it easier to exploit stack buffer overflows (it’s still possible to exploit non-executable stacks using return-to-libc and other techniques). This lack of a PTE no-execute flag illustrates a broader fact: permission flags in a VMA may or may not translate cleanly into hardware protection. The kernel does what it can, but ultimately the architecture limits what is possible.

Virtual memory doesn’t store anything, it simply maps a program’s address space onto the underlying physical memory, which is accessed by the processor as a large block called the physical address space. While memory operations on the bus are somewhat involved, we can ignore that here and assume that physical addresses range from zero to the top of available memory in one-byte increments. This physical address space is broken down by the kernel into page frames. The processor doesn’t know or care about frames, yet they are crucial to the kernel because the page frame is the unit of physical memory management. Both Linux and Windows use 4KB page frames in 32-bit mode; here is an example of a machine with 2GB of RAM:
Physical Address Space
In Linux each page frame is tracked by a descriptor and several flags. Together these descriptors track the entire physical memory in the computer; the precise state of each page frame is always known. Physical memory is managed with the buddy memory allocation technique, hence a page frame is free if it’s available for allocation via the buddy system. An allocated page frame might be anonymous, holding program data, or it might be in the page cache, holding data stored in a file or block device. There are other exotic page frame uses, but leave them alone for now. Windows has an analogous Page Frame Number (PFN) database to track physical memory.

Let’s put together virtual memory areas, page table entries and page frames to understand how this all works. Below is an example of a user heap:
Heap Mapped
Blue rectangles represent pages in the VMA range, while arrows represent page table entries mapping pages onto page frames. Some virtual pages lack arrows; this means their corresponding PTEs have the Present flag clear. This could be because the pages have never been touched or because their contents have been swapped out. In either case access to these pages will lead to page faults, even though they are within the VMA. It may seem strange for the VMA and the page tables to disagree, yet this often happens.

A VMA is like a contract between your program and the kernel. You ask for something to be done (memory allocated, a file mapped, etc.), the kernel says “sure”, and it creates or updates the appropriate VMA. But it does not actually honor the request right away, it waits until a page fault happens to do real work. The kernel is a lazy, deceitful sack of scum; this is the fundamental principle of virtual memory. It applies in most situations, some familiar and some surprising, but the rule is that VMAs record what has been agreed upon, while PTEs reflect what has actually been done by the lazy kernel. These two data structures together manage a program’s memory; both play a role in resolving page faults, freeing memory, swapping memory out, and so on. Let’s take the simple case of memory allocation:
Heap Allocation
When the program asks for more memory via the brk() system call, the kernel simply updates the heap VMA and calls it good. No page frames are actually allocated at this point and the new pages are not present in physical memory. Once the program tries to access the pages, the processor page faults and do_page_fault() is called. It searches for the VMA covering the faulted virtual address using find_vma(). If found, the permissions on the VMA are also checked against the attempted access (read or write). If there’s no suitable VMA, no contract covers the attempted memory access and the process is punished by Segmentation Fault.

When a VMA is found the kernel must handle the fault by looking at the PTE contents and the type of VMA. In our case, the PTE shows the page is not present. In fact, our PTE is completely blank (all zeros), which in Linux means the virtual page has never been mapped. Since this is an anonymous VMA, we have a purely RAM affair that must be handled by do_anonymous_page(), which allocates a page frame and makes a PTE to map the faulted virtual page onto the freshly allocated frame.

Things could have been different. The PTE for a swapped out page, for example, has 0 in the Present flag but is not blank. Instead, it stores the swap location holding the page contents, which must be read from disk and loaded into a page frame by do_swap_page() in what is called a major fault.
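To watch the lazy contract from user space, here's a hedged sketch using sbrk(), the traditional library wrapper around brk(); the page count is arbitrary. Growing the heap only widens the VMA, and each page is backed by a frame only once it is first touched:

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const long page = sysconf(_SC_PAGESIZE);   /* 4096 on x86 */
    const int  npages = 16;

    /* Grow the heap: the kernel only enlarges the heap VMA here.
       No page frames are allocated yet and the PTEs stay blank. */
    char *start = sbrk(npages * page);
    if (start == (void *)-1) {
        perror("sbrk");
        return 1;
    }

    /* The first touch of each page faults into do_page_fault(), which finds
       the heap VMA, sees an anonymous mapping, and wires in a fresh frame. */
    for (int i = 0; i < npages; i++)
        start[i * page] = 1;

    printf("grew the heap by %d pages at %p\n", npages, (void *)start);
    return 0;
}
```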

This concludes the first half of our tour through the kernel's user memory management. In the next post, we'll throw files into the mix to build a complete picture of memory fundamentals, including consequences for performance.

10/1/11

How to run Linux in a virtual machine

by Graham Morrison | June 28th 2010

Guide: Test new releases without harming your existing installation

Try any distro: Step by step
Virtualisation doesn't have to be scary. It isn't the sole domain of the enterprise, or cloud computing, or server farms. It's just as useful, and just as manageable, on the average desktop, and there now seem to be almost as many ways to virtualise Linux as there are distributions themselves.

You could pay money, for example, and buy a workstation product from either VMware or Parallels, both of which have excellent performance, support and some advanced features. Or you could try their open source equivalents, the wonderful VirtualBox and Qemu.
We used the command line to install KVM, but you may find it easier to try the Add/Remove software app first
But there's another alternative, and it can offer the most transparent virtualisation integration into your current configuration, making it an ideal way to experiment with new distributions, and put them to the test.

This is KVM, the kernel-based virtual machine. These three letters might once have scared you off with stories of complexity and VNC sessions, but thanks to a wonderful Red Hat project called Virt-Manager, almost anyone with the right hardware can install KVM and get their own virtual machines running in no time at all.

And virtual machines really are the best way to experiment with the plethora of Linux distributions there are on offer. They're non-destructive, easy to set up, and almost as fast as the real thing to use. They're the best way to get a feel for a distribution without committing real hardware to the installation process, and virtualisation enables you to race through as many installations as your broadband connection allows.

And like all great journeys, it starts with the first step…

Step 1: Check your hardware compatibility

Before we go any further, you need to make sure that your system is up to the task of running other operating systems within a virtual machine. As a general rule, any machine from the last three years should work, but there are some system specifics you should look for.
Most importantly, your CPU should offer what's known as a processor extension for virtualisation. All the modern virtualisation solutions use this to dramatically enhance the performance of the virtualised machine, although older applications such as VMware Player will still work if you don't happen to have a CPU with the correct extension.

Which extension you should look for depends on which brand of processor you're using. Intel users should look for VT-x, while AMD users should look for AMD-V, for example.

You can check your CPU for compatibility by opening a terminal and typing cat /proc/cpuinfo. This will list everything, and if you've got more than one core, more than one processor, or hyperthreading enabled, you'll see the list repeated for every instance.

Just make sure there's either a vmx or an svm entry in the flags section for any of these cores. The former is Intel's name for the virtualisation extension, while the latter is AMD's equivalent.
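If you'd rather check from a program than eyeball the output, here's a small hedged sketch in C that scans /proc/cpuinfo for either flag (the exit status is just a convenience):

```c
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (f == NULL) {
        perror("/proc/cpuinfo");
        return 1;
    }
    char line[4096];
    int found = 0;
    while (fgets(line, sizeof line, f)) {
        /* Only the "flags" lines matter; look for Intel's vmx or AMD's svm. */
        if (strncmp(line, "flags", 5) == 0 &&
            (strstr(line, " vmx") || strstr(line, " svm")))
            found = 1;
    }
    fclose(f);
    puts(found ? "hardware virtualisation supported (vmx/svm found)"
               : "no vmx/svm flag found - check your BIOS settings");
    return !found;
}
```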

Fiddle with the BIOS

If neither appears and you think your machine should be capable, then virtualisation may be turned off in your system BIOS. Getting into the BIOS involves restarting your machine and pressing a specific key as your system's POST messages appear, before the Grub menu. This key is usually Delete, F2 or, occasionally, F10.

The location of the setting is also dependent on your specific BIOS, but you should find the entry under either an Integrated Peripherals section, or within the Security menus.

Finally, you'll need to make sure you have enough storage space and memory. A virtual machine eats real resources, which means you'll need to allocate memory and storage space to each virtual instance while leaving enough over for your native operating system.

Linux distributions generally work well with between 512MB and 1GB of memory assigned to them, so you'll need 1GB as a minimum, and ideally 2GB or more.

It's the same with storage. A standard installation normally takes around 5GB as a minimum (depending on the distribution of course; if you're running a mini-distro such as Puppy Linux you'll need less than this), but if you're going to use a virtual machine for real work, you'll need to allocate more space.

Step 2: Use the right base distribution

We've chosen Fedora 12 because it has the best implementation of Virt-Manager, and there's no reason to doubt that Fedora 13 will be equally as adept.

Fedora is easy to install and provides a top-class distribution. Virt-Manager is the application used to manage both Xen and KVM virtual machines on your system, and without it, these instructions would be a lot harder to follow.
This is because Virt-Manager turns what can be a very complicated setup procedure into a few simple clicks of the mouse. If you've ever created a virtual machine with the commercial VMware or Parallels Workstation, you'll find the job just as easy with Virt-Manager.

The discovery that Fedora is the best distribution to use if you want to play with Virt-Manager is no surprise considering it was developed by Red Hat, but what is surprising is that there aren't more distributions taking this open source project and making it an integral part of their virtualisation strategy, because recent versions of the application are so good.

Even Ubuntu, a distribution which has thrown its cards in with KVM as part of its campaign to push cloud computing with Eucalyptus, only manages to bundle an older version of Virt-Manager in its package repositories, and this older version is severely crippled in terms of features and usability.

Hopefully, the imminent 10.04 should address this problem, and Ubuntu users will soon be able to install a recent version of Virt-Manager without too much difficulty.

Step 3: Install the virtualisation software

The hardest step in the whole of this process is probably installing the specific packages required for virtualisation, simply because you get the best results from the command line, which many people seem to be naturally allergic to.

But don't let that put you off if you've never used it before – we're only entering a line or two of text, and it shouldn't cause any problems. It's just the way Fedora works best.
Quite a few packages are required to get virtualisation up and running with a default Fedora 12 installation. You could use the package manager, launched by clicking on Add/Remove Software within the System > Administration menu, but we had trouble tracking down the KVM package, and had better luck using Yum on the command line.

From the command line, launched from the Applications > System Tools menu, type su followed by the root password. To install packages, type yum install followed by the packages you wish to install.

Here's the line we used: yum install kvm virt-manager libvirt
You should find that there are plenty of other packages that need to be installed as dependencies, and these will be grabbed automatically.

After installation, you can either restart your system or type /etc/init.d/libvirtd start (or use the service command) to start the virtualisation management process.

After that, you're ready to dive into the Virt-Manager application.

Step 4: Fire up Virt-Manager

Virt-Manager can be found by clicking on System Tools > Virtual Machine Manager. You'll need to enter your root password to be able to use the application, but there isn't much to see when it first runs. There should be just a single connection listed in the main window as 'localhost (Qemu)'.

In Virt-Manager terminology, a connection allows it to manage the virtualisation, and connections can be to remote machines as well as local ones. Localhost is your current machine, and Qemu is the virtualisation technology that the connection is using.
The reason why Qemu is listed instead of KVM is because KVM needs Qemu to provide access to the standard emulation elements required to run a virtual machine, such as the BIOS emulation. Qemu then uses KVM to access the virtualised parts.

If it doesn't appear on the list, or if you want to create a new connection, click on Add Connection from the File menu and choose Qemu/KVM from the drop-down Hypervisor list. You could also choose Xen if it's installed and you feel like experimenting with a different technology.

With older versions of Virt-Manager, you would now have to manually create a shared storage device by right-clicking on the connection and choosing the storage device. New versions handle this automatically from the instance-creation wizard, which is the next step.

Step 5: Create a new virtual machine

Click on the Play icon in the top-left, then give your creation a name. If you'll be running several different distributions, it's a good idea to name the virtual machine after the distribution you'll be running. Also, make sure you select Local Install Media, as this is how we're going to install the distribution.

On the following screen, select Use ISO Image and use the Browse button to navigate to the location of your ISO. You'll have to click on the Browse Local button to jump from the virtual storage to your home directory.
Virtual storage is the space that Virt-Manager uses to store its own virtual drives. Choose Linux as the OS Type in the drop-down menu below the ISO location. To ensure the best possible compatibility, try to choose a distribution with the closest match to the distro you want to try. For distributions like Mint or Crunchbang, for example, you could select Ubuntu 9.10.

On the next page, you need to select how much RAM you want to assign to the virtual machine. Minimal distributions like Dreamlinux could get away with 512MB or lower, but a modern Gnome or KDE-based distribution would ideally need 768MB or more. The more memory your virtual machine has, the better it will perform.

Finally, on the next page, make sure to select Enable Storage, and click on Create A Disk Image. If you've got enough disk space, increase the default 8GB accordingly. Leave the final page at its default settings and click on Finish.

Step 6: Boot your new distribution

After clicking on Finish you'll find that Virt-Manager spontaneously launches into the boot process for your chosen distribution. A few moments after that, you will see exactly the same boot routine you'd expect if you were booting off a real drive with a physical disk. This all means that everything is working exactly as it should, and you've been able to successfully create and run your first virtual machine.

What happens next depends completely on the distribution you've chosen to run. Linux Mint, for example, will present a fully functional desktop, whereas other distributions may ask you to walk through the installation routine. Either way, you can get full control of the virtual machine by clicking within the window.
KVM will then take over your mouse and keyboard. You should see a small notification window telling you that the pointer has been grabbed and letting you know the key combination you need to use if you want to escape from your virtual machine back to the native environment.

This key combination is normally the left Ctrl and Alt keys, which has become something of a standard with virtualisation software. Pressing these together should give control back to your normal desktop.

If you ever need to send this specific key combination to the virtual machine you can use the Send Key menu, which lists all the various combinations you might want to use, such as Ctrl+Alt+F1 for the first virtual console view, or Ctrl+Alt+Backspace to restart the X server.

Try any distro: Get more from Virt-Manager
Because there aren't any buttons on the front of your virtual machine's beige box, you have to shut down, restart and power off your virtual machine from within the software. These functions can be found by either right-clicking on the running machine, or from the drop-down menu in the toolbar. Depending on the virtualised distribution, both shutdown and restart options should be safe to use.

This is because KVM sends the request to the virtual operating system, and this should handle it in exactly the same way as it would the same option being selected from the shutdown menu in Gnome, or a quick press of the power button in a system that responds correctly to ACPI messages. This means that you'll be warned of the impending shutdown and you'll have a chance to respond to any applications that are still open and save any files.

This won't happen if you choose 'Force Off' from the shutdown menu, as this is the virtual equivalent to pulling the power cable out of the wall. In this case, anything that's not saved to the virtual storage device will be lost.

You may also have noticed a Pause button in the Virt-Manager toolbar. This will instantly stop the virtual machine, which can be resumed from the same point by pressing the Pause button again. But unlike the same functionality in VMware, a paused system will not survive a reboot and you'll lose data held in running applications that haven't saved their state.

Danger – virtualisation ahead

Another important thing to realise is that, just because your data is virtual and there's no power cable to each machine, that doesn't mean it's safe from disaster: your work is even more fragile within a virtual environment than it is on your normal desktop. This is because there are more things that can go wrong, and your data is usually harder to get at should you wish to restore it.

It isn't a problem if you manage your data appropriately, but it's something you should be aware of if you start spending a lot of time within a virtual machine.

After running the virtual machine for the first time, you may wonder how you change the disc image to point to another ISO file, or even get back to the same information you saw when you first ran the machine. This configuration panel is accessed from the view panel of the virtual machine you wish to change, and you need to make sure that the machine isn't running to be able to change settings safely.

Just click on View -> Details to enter the editor. You will then see a window that offers a comprehensive overview of the virtual hardware being emulated for your machine. Click on IDE CDROM 1, for example, followed by Connect on the panel to the right, and you'll be able to select a new CD/DVD image for your virtual machine. Click on Memory and you can adjust the amount of memory assigned to the machine.

This is handy if you either under- or overestimated how your virtual machine might perform when you ran through the startup wizard.

Check your hardware

You might also want to look at the graphics hardware being emulated. This is found on the Display page, and by default, it's something called 'cirrus'.

The Cirrus Logic chipset that this emulates is one of the most common and broadly compatible chipsets available, with excellent support across a wide range of operating systems. It's perfect for running old distros, MS DOS and even Windows, for example, but it's not the fastest driver, and if you're going to be spending a lot of time in your virtual distribution, it might be worth switching to 'vmvga' in the model list.

This is a close match for VMware's own graphics driver, and is better suited to virtualisation. If your virtualised distribution is able to use an implementation of VMware's open source graphics driver, you should find this option performs better on your system. If not, you can always switch back.

Recent versions of Virt-Manager can also be made to scale the resolution of the virtual machine's display to the size of the window. Just enable the Scale To Display > Always option in the View menu. If you have a virtual screen resolution higher than your host machine's, you will need to enable this option or you'll have to manually scroll around the display, which could get a bit wearing.

Another neat function is that your virtual machines are also available through VNC, the remote desktop protocol. To get started with this, take a look at the Display VNC page in the settings viewer.

When your virtual machine is running, you'll see a port listed for the service. You will then be able to access the desktop of your virtual machine using a VNC client, such as Vinagre on Gnome or KRDC on KDE. For a client running on the same machine, just point it at localhost:5900 for the first virtual machine. Change the port number to the one listed in the details view if this doesn't work, and you'll see the same desktop session displayed within the Virt-Manager virtual machine view.

This has all kinds of potential uses, from accessing your virtual machines from a remote computer somewhere out in the wilds of the internet to duplicating the desktop for use as a live demonstration with a projector.

Advanced features

You might also have guessed from the Virt-Manager GUI design that you can run as many separate virtual machines as your system will allow. The only real limitation is physical memory, because this is likely to be the weakest link in your system.

When each machine is running simultaneously, you will need enough memory to cater to the specific requirement of each. With 4GB of RAM, for instance, you can run three virtual machines alongside your normal desktop if each were given 1GB of RAM, and you can check the performance of each instance using the CPU meter to the right of each entry in the virtual machine list.

If you need more information on each machine's memory usage, disk throughput and network bandwidth, take a look at the Performance page of the Details window. One of KVM's more advanced features is its ability to access your real hardware through the Physical Host Device functionality.

But before you set your expectations too high, this doesn't mean you can pass your latest high-powered Nvidia graphics card or audio device through to the virtual machine – these are far too complex to work. But you should be able to get most network adaptors to function as well as many USB storage devices.

To get these to work, open the Details window from the virtual machine view and click on the Add Hardware button at the bottom of the list on the left. From the window that appears, select Physical Host Device from the drop-down menu, click on Forward and select the device from the Type and Device lists that appear. Use the Type menu to switch between PCI and USB buses, and the Device list to choose the specific device from the list that appears.

This facility is a little experimental, but you may find that many simpler devices should work without any further modification.

Other ways to experiment

Virtualisation isn't the only way to try a new distribution – it just happens to be the most unobtrusive and easiest to use. But if you want to take your testing a little more seriously, running a new release on your own hardware without messing up your primary installation, there are several techniques that can help make the process easier.

Unlike a couple of commercial operating systems we could mention, most modern Linux distributions will quite happily install themselves alongside other operating systems and distributions, automatically adding their boot options to a boot menu.

To be able to do this you will need to make sure you have enough space on your hard drive. This is where things can get tricky, because the first distribution you install on your system will want to use all the available space, making further installations harder to achieve.

It's for this reason that you can save yourself a lot of hassle by forcing the first distribution you install to only use a certain amount of space, and to do this you need to use the manual installation tool.

Go compare

Different distributions use different tools to manage the partitioning process, but they all share the same basic functionality. You get to choose between an automatic installation and a manual installation. The former will usually blank your hard drive or use any remaining space in its entirety, while the latter option requires a bit more know-how.

You'll need a minimum of two partitions: one to hold all the files required by your distribution, and another for what's known as swap space. If there's enough free space on the drive, you'll be able to create a new partition and define exactly how much space you want it to occupy.

You will also need to select a filesystem for the partition, and most people will use either ext4 or ext3, unless you have any special requirements.

The swap partition is an area of the hard drive used as an overflow space for data shuffled from your RAM. The rule of thumb used to be that the swap partition should be twice the size of your RAM, with 2GB as an upper limit.

After creating both partitions, you will need to set a mount point for each. The main partition needs to be set to / for root, while the swap space is normally listed as linux-swap. Both will need to be formatted if your installer provided the option, and you can now let the installation progress as normal.

You'll need to go through a similar routine with any further distributions you want to install, using any remaining space to create the partitions you need for each installation.

Resize with GParted

If you're already running a distribution and you want to resize your current partition to make space for a new installation, then GParted, the tool used by most installers, can do the job – with a couple of caveats.

We've had the best results with GParted by booting from a live CD that includes the application, such as Ubuntu. This should give it complete access to your drives and enable you to resize partitions without having to worry about data access.
Resizing is then a simple matter of selecting the partition you want to shrink (or enlarge) and clicking on the Resize button. From the window that appears, drag either the left or right border of the partition to shrink the space it occupies on the drive.

After you've mastered the art of manual partitioning, there's another useful aspect to taking control of your data, and that's creating a separate home partition. You just need to create another partition alongside root and swap, give it a filesystem and assign it a mount point of /home.

Most installations will let you choose an existing partition to use as home and won't require it to be reformatted as part of the process. This means that any user accounts, along with their data, will appear intact and accessible from the moment you boot into the new distribution, which is especially useful if your time happens to be spread across more than one distro.

We would recommend creating a separate user for each distribution. This avoids any potential overlap in home directories and configuration files when you create a user account that already exists in a different distribution.

If you want to port your settings from one account to another, you can do that manually with the command line after you get the desktop up and running. Just copy the entire contents of one home directory to another using the cp -rf source destination command and make sure the owner and group are adjusted to reflect the user who's going to access the new directory (try chown -R username:username directory).

Even if you can't work with manual partitions, there's still the easiest option of all, and that's the humble live CD. Many distributions now include their own bootable versions, enabling you to test a distribution for hardware compatibility as well as for its design and usability. You can get a very good idea of what the final installation may feel like from a distribution running off a CD, even if the slow data read times from the optical disc may make things a little slow.

Distro on a stick

This speed issue can be solved by installing any distribution you may want to try on to a spare USB thumbdrive and booting from this when you restart your machine. Creating a USB-based distribution like this used to be a chore, but thanks to the Unetbootin tool, you can create a bootable stick for all the most common distributions with just a few clicks of the mouse.
Most distributions will include the Unetbootin package, and the application itself will likely need administrator privileges. When it's running, just point it at the location of your distribution's ISO and select the distribution version from the drop-down lists, followed by the location of your memory stick.

Any PC from the last 3–5 years should be able to boot from the USB drive without any further interaction, but sometimes you may need to enter a boot menu from the BIOS, or change the boot order in the BIOS itself.

The end result will be a distro running from external storage at almost the same speed as a native distribution.
---------------------------------------------------------------------------------------------------
First published in Linux Format Issue 132


29/11/10

A history of viruses on Linux

Brandon Boyce
27 November 2010 - 17:44

We recently gave you a brief history of viruses on the Mac and, as requested by a user, we wanted to give you a history of viruses on Linux. Given the tight security integrated into Linux, it is difficult to take advantage of a vulnerability on the computer, but some programmers have found ways around the security measures. There are several free options for anti-virus on Linux that you really should use, even if the scanner isn't always running - a weekly or monthly scan doesn't hurt. Free anti-virus solutions include ClamAV, AVG, Avast and F-Prot.

1996:
The cracker group VLAD wrote the first Linux virus, named Staog. The virus took advantage of a flaw in the kernel that allowed it to stay resident on the machine and wait for a binary file to be executed. Once executed, the virus would attach itself to that file. Shortly after the virus was discovered, the flaw was fixed and the virus quickly became extinct. VLAD was also responsible for writing the first known virus for Windows 95, Boza.

1997:
The Bliss computer virus made its way out into the wild. The virus would attach itself to executables on the system and prevent them from running. A user had to have root access for the virus to be effective, and to this day Debian lists itself as still being vulnerable to this virus. The threat to Debian is minimal, though, as users do not typically run as root.

1999:
No significant viruses were reported this year, but oddly enough a hoax message went around stating there was a virus that was threatening to install Linux on your computer. At the time, the Melissa virus was ravaging PCs worldwide, and on April 1, 1999 (April Fools' Day) a message went out warning that a virus named Tuxissa was running about, secretly installing Linux on unsuspecting computers.

2000:
A rather harmless virus, Virus.Linux.Winter.341, showed up and inserted itself into ELF files (ELF is the executable file format on Linux). The virus was very small, only 341 bytes, and would insert the string 'LoTek by Wintermute' into the Notes section of an ELF file. The virus was also supposed to change the computer name to Wintermute, but it never gained control of a machine to effect the change.

2001:
This was an eventful year for Linux viruses. The first was the ZipWorm, a harmless virus that would simply attach itself to any zip files located in the same directory it was executed in. Next was the Satyr virus, also harmless, which would simply attach itself to ELF files, adding the string unix.satyr version 1.0 (c)oded jan-2001 by Shitdown [MIONS], http://shitdown.sf.**(edited as URL causes Avast to block page). There was also a virus released called Ramen, which would replace index.html files with its own version displaying Ramen Crew at the top and a package of Ramen Noodles at the bottom. Later a worm by the name of Cheese came out that actually closed the backdoors created by the Ramen virus. There were several other viruses released this year that were relatively harmless.

2002:
A vulnerability in Apache led to the creation and spread of the Mighty worm. The worm would exploit a vulnerability in Apache's SSL interface, then infect the unsuspecting victim's computer. Once on the computer, it would create a secret connection to an IRC server and join a channel to wait for commands to be sent to it.

2003:
Another harmless virus showed up, called the Rike virus. The virus, which was written in assembly language, would attach itself to an ELF file. Once attached, it would expand the space the file required and write RIKE into that free space.

2004:
Similar to the virus from the previous year, the Binom virus would simply expand the size of the file and write the string [ Cyneox/DCA into the free space. The virus was spread by executing an infected file.

2005:
The Lupper worm began spreading to vulnerable Linux web servers. The worm would hit a web server looking for a specific URL, then it would attempt to exploit a vulnerable PHP/CGI script. If the server then allowed remote shell command execution and file downloads, it would become infected and begin searching for another server to infect.

2006:
A variant of the Mighty worm from 2002 named Kaiten was born. It would open a connection to an IRC channel and wait for commands to be sent and executed.

2007:
An exploit in OpenOffice led to the spread of a virus named BadBunny. This virus would infect Windows, Mac and Linux machines. The virus creates a file called badbunny.py as an XChat script and creates badbunny.pl, a Perl virus infecting other Perl files. There was also a trojan horse released by the name of Rexob. Once on the machine, it would open a backdoor allowing remote code execution.

2009:
A website for GNOME users to download screensavers and other pieces of eye-candy unknowingly hosted a malicious screen saver called WaterFall. Once installed on the machine, it would open up a backdoor that, when executed, would cause the machine to assist in a distributed denial of service (DDoS) attack. The attack was very specific and targeted a single website, MMOwned.com.

2010:
The Koobface virus, a virus that spreads through social networking sites, targets Windows, Mac and, in a more recent variant, Linux computers. Once infected, the virus attempts to gather login information for FTP and social networking sites. Once your password has been compromised, the virus will send an infected message to all of your friends in your social network.

This is by no means a complete list of Linux viruses, but it does cover the major ones. It also points out that most of the viruses found on Linux are fairly harmless. That doesn't mean they don't exist, though. Be sure to keep an eye on what you're downloading and where you're going on the Internet and you will most likely stay virus free. An occasional virus scan wouldn't hurt either.

Sources:
hackinglibrary.ws
wikipedia.org
irc-security.de
securelist.com
f-secure.com
cnet.com
techrepublic.com
lwn.net
crenk.com