
MaxDB – Space oddities

Recently a customer suspected that space was somehow missing from his database.
Although he had deleted a whole lot of data via archiving, the data area was still quite full.
To investigate, he decided to check whether the information shown in DBM GUI is actually correct.

As some of you may know, DBM GUI simply runs DBM and SQL commands in the background to gather its information.
These commands can be revealed via “Instance -> Show Console”.

Getting the data 

To get the filling level of the database, the command in charge is “info_state”.
Running this command manually in dbmcli brings up the following information:

dbmcli on 760c>info state
OK
END
Name                    | Value
Data           (KB)     = 3248
Perm Data      (KB)     = 3136
Temp Data      (KB)     = 112
Data        (Pages)     = 406
Perm Data   (Pages)     = 392
Temp Data   (Pages)     = 14
Data            (%)     = 6
Perm Data       (%)     = 6
Temp Data       (%)     = 0
Log            (KB)     = 8
Log         (Pages)     = 1
Log             (%)     = 0
Sessions                = 2
Sessions        (%)     = 10
Data Cache      (%)     = 96
Converter Cache (%)     = 0
Data Max       (KB)     = 51184
Log Max        (KB)     = 22448
Data Max    (Pages)     = 6398
Log Max     (Pages)     = 2806
Sessions Max            = 20
Database Full           = No
Connect Possible        = Yes
Command Monitoring      = Off
Database Monitoring     = On
Kernel Trace            = Off
Autosave                = Off
Bad Indexes             = 0
Log Full                = No
Bad Devspaces           = 0
Data Cache OMS      (%) = 0
Data Cache OMS Log  (%) = 0

If we dig a bit deeper, we find that this command is actually a predefined SQL statement that queries a view “SYSDBM”.”INFO_STATE”, which is defined in the DBMVIEWS.py file as:

    createReplaceView (session, 'INFO_STATE',"""
    (
        "Data           (KB)",
        "Perm Data      (KB)",
        "Temp Data      (KB)",
        "Data        (Pages)",
        "Perm Data   (Pages)",
        "Temp Data   (Pages)",
        "Data            (%)",
        "Perm Data       (%)",
        "Temp Data       (%)",
        "Log            (KB)",
        "Log         (Pages)",
        "Log             (%)",
        "Sessions",
        "Sessions        (%)",
        "Data Cache      (%)",
        "Converter Cache (%)",
        "Data Max       (KB)",
        "Log Max        (KB)",
        "Data Max    (Pages)",
        "Log Max     (Pages)",
        "Sessions Max",
        "Database Full",
        "Connect Possible",
        "Command Monitoring",
        "Database Monitoring",
        "Kernel Trace",
        "Autosave",
        "Bad Indexes",
        "Log Full",
        "Bad Devspaces",
        "Data Cache OMS      (%)",
        "Data Cache OMS Log  (%)"
    )
    AS SELECT
        A.USEDSIZE,
        A.USEDSIZE - A.USEDTEMPORARYSIZE,
        A.USEDTEMPORARYSIZE,
        FIXED(A.USEDSIZE/D.PAGESIZE_IN_KB),
        FIXED((A.USEDSIZE - A.USEDTEMPORARYSIZE)/D.PAGESIZE_IN_KB),
        FIXED(A.USEDTEMPORARYSIZE/D.PAGESIZE_IN_KB),
        FIXED(A.USEDSIZE/A.USABLESIZE*100, 3),
        FIXED((A.USEDSIZE - A.USEDTEMPORARYSIZE)/A.USABLESIZE*100, 3),
        FIXED(A.USEDTEMPORARYSIZE/A.USABLESIZE*100, 3),
        S.USED_LOG_PAGES*D.PAGESIZE_IN_KB,
        S.USED_LOG_PAGES,
        S.PCT_LOGUSED,
        D.ACTIVE_SESSIONS,
        FIXED(D.ACTIVE_SESSIONS/D.MAXUSERS*100, 3),
        D.DATACACHE_HITRATE,
        0,
        A.USABLESIZE,
        S.LOG_PAGES*D.PAGESIZE_IN_KB,
        FIXED(A.USABLESIZE/D.PAGESIZE_IN_KB),
        S.LOG_PAGES,
        D.MAXUSERS,
        D.DATABASEFULL,
        D.CONNECTPOSSIBLE,
        D.DIAGMONITOR,
        D.MONITOR,
        D.KNLTRACE,
        D.AUTOSAVESTANDBY,
        D.BADINDEXES,
        D.LOGFULL,
        D.BADVOLUMES,
        D.DATA_OMS_HITRATE,
        D.LOG_OMS_HITRATE
    FROM
        SYSDD.DATASTATISTICS A,
        SYSDD.SERVERDB_STATS S,
        SYSDD.DBM_STATE D
    INTERNAL""")
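As a sanity check, the arithmetic of this view can be replayed against the raw KB figures from the info_state output above. The following is just a Python sketch of the view's expressions; the 8 KB page size and the truncating integer display are assumptions inferred from the numbers shown (3248 KB / 406 pages = 8 KB):

```python
# Replay the INFO_STATE arithmetic from the raw KB figures shown above.
# Assumptions: 8 KB page size (D.PAGESIZE_IN_KB) and truncating display.

PAGESIZE_IN_KB = 8

used_kb = 3248          # Data (KB)
used_temp_kb = 112      # Temp Data (KB)
usable_kb = 51184       # Data Max (KB)
sessions, max_sessions = 2, 20

data_pages = used_kb // PAGESIZE_IN_KB              # FIXED(A.USEDSIZE/D.PAGESIZE_IN_KB)
perm_pages = (used_kb - used_temp_kb) // PAGESIZE_IN_KB
temp_pages = used_temp_kb // PAGESIZE_IN_KB
data_pct = int(used_kb / usable_kb * 100)           # FIXED(..., 3), displayed truncated
sessions_pct = int(sessions / max_sessions * 100)

print(data_pages, perm_pages, temp_pages, data_pct, sessions_pct)
# -> 406 392 14 6 10  (matching the info_state output)
```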

Easy to see: the storage-related figures are taken from “SYSDD”.”DATASTATISTICS”.
As the SYSDD tables are reserved for internal access, we can simply use the view “SYSINFO”.”DATASTATISTICS” instead, which is available in the database catalog.
(Note: there are some internal schemas in MaxDB that are usually hidden from users; “SYSDD” and “SYSDBM” are just two of them.)

So let’s check “DATASTATISTICS” on my test system:

select 'STAT' as source, USEDSIZE used_kb, round(usedsize/1024, 2) used_MB
from datastatistics

SOURCE  USED_KB  USED_MB
------------------------
STAT     467208   456.26
========================


So far nothing impressive – about 456 MB are used up in the data area.

Check the freespace accounting 

Now the customer had his doubts and wanted to cross-check this value.
The idea was obvious:
count all tables and indexes in the database and the space allocated for them – the sum should match the information from “DATASTATISTICS”.

The statement used looked like this:

select 'FILES' as source,
  sum(treeleavessize+treeindexsize+lobsize) as used_kb
 ,round(sum(treeleavessize+treeindexsize+lobsize)/1024,2) as used_MB
from files

SOURCE  USED_KB  USED_MB
------------------------
FILES    469672   458.66
========================

Comparing the two results shows quite a difference:

SOURCE  USED_KB  USED_MB
------------------------
STAT     467208   456.26
FILES    469672   458.66
------------------------
DELTA      2464     2.41
========================

So where do these extra 2.41 MB in my system come from?

In the customer’s case, the difference was in the range of several hundred MB.

Where is my freespace? 

The answer is simple but a bit surprising.

It’s about how the FILES system table works.
For each internal FILE (or SEGMENT, for the Oracle folks reading this) there is one row in this table.
However, the column LOBSIZE is kind of special.

Here’s why:
MaxDB stores the data of LONG/LOB columns in separate FILES.
So for each table with LONG/LOB columns, we find one row for the table itself and one row for each LONG/LOB column in the FILES system table.
If FILES.TYPE is ‘TABLE’, then LOBSIZE is the sum of the allocated space of all the dependent LONG/LOB columns of this table.

Basically the value is internally created like this:

select f.fileid, sum(lf.treeindexsize + lf.treeleavessize) as lobsize
from files f, files lf
where f.fileid = lf.primaryfileid
  and f.type = 'TABLE'
  and lf.type = 'SHORT COLUMN FILE'
group by f.fileid
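The same rollup can be sketched in plain Python. The file entries below are made up purely for illustration; the column names follow the FILES table described above:

```python
# Sketch of how LOBSIZE is rolled up: for every FILE of type 'TABLE',
# sum TREEINDEXSIZE + TREELEAVESSIZE of its dependent LOB files
# (type 'SHORT COLUMN FILE', linked via PRIMARYFILEID).
# The file entries are hypothetical; sizes in KB.

files = [
    # (fileid, primaryfileid, type, treeleavessize, treeindexsize)
    (1, None, "TABLE",             800, 40),
    (2, 1,    "SHORT COLUMN FILE", 300, 16),  # LOB column of table 1
    (3, 1,    "SHORT COLUMN FILE", 100,  8),  # another LOB column of table 1
    (4, None, "TABLE",             500, 24),  # table without LOB columns
]

lobsize = {}
for fid, primary, ftype, leaves, index in files:
    if ftype == "SHORT COLUMN FILE":
        lobsize[primary] = lobsize.get(primary, 0) + leaves + index

print(lobsize)   # -> {1: 424}
```

So table 1 carries a LOBSIZE of 424 KB, even though that space is also reported by the two ‘SHORT COLUMN FILE’ rows themselves.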

Getting the data the right way 

So with the initial statement we actually counted the LONG/LOB files twice.
Therefore, a good approximation of the used space is this adapted version of the statement, where we simply exclude the FILE entries for those LONG/LOB columns:

select 'STAT' as source, USEDSIZE used_kb, round(usedsize/1024, 2) used_MB
from datastatistics
UNION ALL
select 'FILES' as source,
  sum(treeleavessize+treeindexsize+lobsize) as used_kb
 ,round(sum(treeleavessize+treeindexsize+lobsize)/1024,2) as used_MB
from files
where type != 'SHORT COLUMN FILE'

SOURCE  USED_KB  USED_MB
------------------------
STAT     467200   456.25
FILES    466728   455.79
------------------------
DELTA      -472    -0.46
========================

As we can see, the deviation from “DATASTATISTICS” is much smaller now.
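The effect of the correction can be demonstrated with the same kind of made-up file entries as before: summing treeleavessize + treeindexsize + lobsize over all rows counts the LOB space twice, while excluding the ‘SHORT COLUMN FILE’ rows counts it exactly once. A Python sketch with hypothetical numbers, not MaxDB internals:

```python
# Demonstrate the double-counting: each LOB column's space appears both
# in its own 'SHORT COLUMN FILE' row and in the owning table's LOBSIZE.
# Hypothetical numbers; sizes in KB.

# (type, treeleavessize, treeindexsize, lobsize)
files = [
    ("TABLE",             800, 40, 424),  # LOBSIZE = sum of its LOB files
    ("SHORT COLUMN FILE", 300, 16,   0),
    ("SHORT COLUMN FILE", 100,  8,   0),
    ("TABLE",             500, 24,   0),  # table without LOB columns
]

naive = sum(l + i + lob for _, l, i, lob in files)
corrected = sum(l + i + lob for t, l, i, lob in files
                if t != "SHORT COLUMN FILE")

print(naive, corrected, naive - corrected)   # -> 2212 1788 424
```

The excess of the naive sum over the corrected one is exactly the LOB space – the part that was counted twice.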

Still, there is about half a MB left over, which we can assume belongs to the CONVERTER.
By checking “IOBUFFERCACHES” we can at least get a rough lower-limit estimate of the size of the CONVERTER:

SELECT CONVERTERUSEDSIZE FROM "SYSINFO"."IOBUFFERCACHES"

CONVERTERUSEDSIZE (KB)
----------------------
                   336
======================

With this information we can be pretty sure that there is no “deadwood” in this database.

‘Deadwood’ issues in MaxDB ?

Should the deviation between the values from “DATASTATISTICS” and “FILES” be much larger, say several hundred MB as it initially looked for the customer, then it may be time to run a CHECK DATA WITH UPDATE.

This will rebuild the converter and the freeblock management and release pages that had been left over by the garbage collector.

After the CHECK DATA WITH UPDATE run you may find messages like these in the KNLDIAG:

[...]
Converte The pageno perm (111553) on block 4/8251 is deleted
Converte The pageno perm (111565) on block 2/4608 is deleted
Converte The pageno perm (111568) on block 4/8211 is deleted
Converte The pageno perm (111574) on block 5/1264 is deleted
Converte The pageno perm (111576) on block 4/8207 is deleted
Converte The pageno perm (111587) on block 4/8335 is deleted
Converte The pageno perm (111602) on block 6/5702 is deleted
Converte The pageno perm (111605) on block 5/1503 is deleted
Converte The pageno perm (111617) on block 6/5791 is deleted
Converte The pageno perm (111643) on block 4/8329 is deleted
CHECK    Check data finished successfully
DATACACH Begin data cache shutdown
DATACACH Released cache frames:   2392
DATACACH Finish data cache shutdown
[...]

Of course you may now ask: “Why does the garbage collector forget about those pages?”.

The garbage collector is a forgetful bastard…

There can be various reasons for that.

One of them is that, currently, the garbage collector is not able to resume the cleanup of a file once it has been interrupted, say by a restart of the database.

This is obviously not so nice, and I’m sure it will be changed in the future, but usually it’s not a big issue, as database instances are not restarted all that often.

I hope you liked this rather long and technical blog.

Best regards,
Lars


4 Comments


  1. Markus Doehr
    The problem with production databases is that you can practically never run CHECK DATA WITH UPDATE in admin mode. On our 2.1 TB system with 96 volumes this takes > 50 hours despite the high degree of parallelism. Actually we can’t run any check data at all because of the locks being created during the run.

    But I heard that there’s development underway to “fix” that 🙂

    Thank you for that insight – really useful.

    1. Lars Breddemann Post author
      You’re right Markus – the current implementation of the CHECK DATA is not very ‘uptime friendly’.

      And yes, there are quite some ideas on what would be possible to do to improve this.

      Anyhow, personally I wouldn’t bet my grandma’s house on the prioritization of these improvements …

      Glad you like the blog – let’s see if I can do some more of this kind.

    2. Lars Breddemann Post author
      Ah – nearly forgot…

      Could you just try and run the check statements on some of your systems and email the result to me?
      I’d like to know how common ‘deadwood’ is on production systems.

      Thanks in advance.

      1. Markus Doehr
        Hi Lars,

        here’s the output:

        STAT   2131743144  2081780.41
        FILES  2129403896  2079495.99

        SELECT CONVERTERUSEDSIZE FROM "SYSINFO"."IOBUFFERCACHES"

        | CONVERTERUSEDSIZE |
        | ----------------- |
        |           1161912 |

        🙂

        Markus

