Friday, December 1, 2017

New implementation of clearing recursive intrusive list

There is still memory leak happens at the end of each test case. Urrhhh! Now only I got to remember this is a new implementation of loading up the file path into memory. A new thing. And the clearMemory() is still implementing the old code.
void FileBot::clearMemory()
{
    for(FileNode &T : fileList) {
        T.sibling.erase_and_dispose(T.sibling.begin(), T.sibling.end(), DisposeFileNode());
    }
 
    fileList.erase_and_dispose(fileList.begin(), fileList.end(), DisposeFileNode());
}
In the new implementation, one I can think of is a recursive function. The recursive function usage is like this: As long as there are file objects inside the sibling, dive into the sibling and look for any other file objects inside the sibling. If it doesn't contain any file object, it will return. This return will go back up one level, then only clean the sibling for the particular file object. The process will continue until it reaches to the beginning level of the path, then only remove the remaining memory from container.
void FileBot::clearMemory(FileNode *fileNode)
{
    bool beginning = false;

    if( fileNode == NULL ) {
       fileNode = &*(fileList.begin());
       beginning = true;
    }

    if( fileNode->sibling.size() > 0 ) {
        cout << "scanning current node: " << fileNode << endl;
        clearMemory(&*(fileNode->sibling.begin()));
    }
    else
        return;

    cout << "clean sibling memory: " << fileNode << " sibling size: " << fileNode->sibling.size() << endl;
    fileNode->sibling.erase_and_dispose(fileNode->sibling.begin(), fileNode->sibling.end(), DisposeFileNode());

    if( beginning == true) {
        cout << "clean root. Root size [" << fileList.size() << "]" << endl;
        fileList.erase_and_dispose(fileList.begin(), fileList.end(), DisposeFileNode());
    }
}
Notice the clearMemory(FileNode* ) is expecting an argument. I have declared this argument to NULL when it is first called. Something like this:
class FileBot
{
public:
   void clearMemory(FileNode *fileNode = NULL);

   ...
   ...
};
Thanks to the C++ great feature. When it is NULL, this indicate the beginning of the file path, after the subsequent call to the clearMemory(), it is no longer NULL anymore. And when the sibling size is zero, clearMemory() wouldn't get called. Thus, no worry about that.

In addition to that, there is a memory leak happened on following test case:
/* load 1 level parent path
 *
 */
BOOST_AUTO_TEST_CASE(TL_4)
{
    BOOST_TEST_MESSAGE("TC4 : load 1 level parent path");

#if defined(WIN32)
    BOOST_TEST_MESSAGE("Test path: D:");
    FileBot fb("D:");
#else
    BOOST_TEST_MESSAGE("Test path: /home");
    FileBot fb("/home");
#endif

    string path = "";
    if( fb.initialize() == 1 )
       path = fb.verifyFileList();

#if defined(WIN32)
    BOOST_TEST(path == "D:");
    fb.clearMemory();
#else
    BOOST_TEST(path == "/home");
#endif
}
This is due to something was not being handle probably in clearMemory(), end up the fileList wasn't clear. This test case consists of a file path with only one level, such as C:\ in Windows or /home in Linux. For Linux, it is a little bit special, /home is actually 2 levels file path. One is root path, and home is under the root. For my requirement, I just treat it as 1 level. Unlike Windows, the root is represented by a label, C:\.

I can't think of a perfect solution to resolve this defect yet. For now, I make a validation check at the beginning of the function:
void FileBot::clearMemory(FileNode *fileNode)
{
    bool beginning = false;

    // this parent have no child
    if( (&*(fileList.begin()))->sibling.size() == 0 ) {
        fileList.erase_and_dispose(fileList.begin(), fileList.end(), DisposeFileNode());
        return;
    }

    ...
    ...
}
When I see there is no sibling, clean up the mess and then bail out from the function.

Monday, November 27, 2017

New solution to build parent path

It took me a few weeks to work on this POC with Boost Intrusive. I was thinking to use Boost Intrusive build a directory path. Just like a tree structure, the branches representing the files and folders, spread until the end of the sub directory. Thus the setup for this class is as follows:
class FileNode : public boost::intrusive::list_base_hook<link_mode<auto_unlink> >
{
private:
    boost::filesystem::path name; // file name
    char type; // f is file, d is directory

public:
    boost::intrusive::list< FileNode, base_hook<list_base_hook<link_mode<auto_unlink> > >, constant_time_size<false> > sibling;

};
I created a FileNode class represent the file in a path, and each object of this class can neither be a file or a folder, and the a name too. These are the private member declared in the class. The sibling member is to tell whether they are any sub directory available, if the size is greater than zero, it means there are files in it.

Now come to the main dish. There are 2 part of it; first would be the load the given path into the memory by using the class structure mention above, second is to constructed of the files underneath the given path. For this POC, I'm focusing on the first part, the second part has not yet done.

I have though should I skip the first part for some while? Say when user keyed in /home/path_1, can I just treat the value as a single FileNode? Since I doesn't really care what the value are, it would be easier for me to do the job. Coming from the perfectionism view-point, I think it would be nice to load every single file entity as a separate FileNode object. Meaning to said that /home is one object, and /path_1 is another object.

For this purpose, I create a function to handle this job for me.
FileNode* FileBot::constructParentPath(string path)
{
    vector<string> words;
    FileNode *node = new FileNode();
    string unixFilePath = path;

    boost::replace_last(unixFilePath, "/", "|");
    boost::split(words, unixFilePath, boost::is_any_of("|"), boost::token_compress_on);

    node->setType('d');
    node->setName(words[1]);

    // there is a parent node
    if( words[0] != "" ) {
        FileNode *parent = constructParentPath(words[0]);

        cout << "parent address: " << parent << endl;

        parent->sibling.push_back(*node);
    }
    else {
        fileList.push_back(*node);
        cout << "node address: " << node << " " << node->getName() << endl;
    }

    words.clear();

    return node;
}
This function is as good as it works only on Linux, but failed on Windows. It can't pass the test case in Windows specific path. After many round of rework, then only I figure out a new solution that tested out on both Linux and Windows:
int FileBot::initialize()
{
    // bail out if the path doesn't exists
    if( !exists(fileNode.getName().generic_string()) )
        return 0;

    // construct the parent path first
    string path = fileNode.getName().generic_string();
    FileNode *parent = NULL;

    typedef split_iterator<string::iterator> string_split_iterator;
    for( string_split_iterator it = make_split_iterator(path, first_finder("/", is_equal()));
         it != string_split_iterator();
         ++it)
    {
        FileNode *node = new FileNode();
        node->setType('d');
        node->setName(copy_range<string>(*it));

        cout << "value inserted: [" << copy_range<string>(*it) << "]" << endl;

        if( parent == NULL ) {
            fileList.push_back(*node);
        }
        else
            parent->sibling.push_back(*node);

        parent = node;
        cout << "parent: " << parent;
        cout << " parent value: " << parent->getName();
        cout << " sibling size: " << parent->sibling.size() << endl;
    }

    return 1;
}
The difference between the 2 solutions is that the first one is using the recursive loop and the later one is using for loop. On top of that the second solution is much easier to read than the first. To check my work is being done correctly, I have created following test case for the unit test.
/* load 1 level parent path
 *
 */
BOOST_AUTO_TEST_CASE(TL_4)
{
    BOOST_TEST_MESSAGE("TC4 : load 1 level parent path");

#if defined(WIN32)
    BOOST_TEST_MESSAGE("Test path: D:");
    FileBot fb("D:");
#else
    BOOST_TEST_MESSAGE("Test path: /home");
    FileBot fb("/home");
#endif

    string path = "";
    if( fb.initialize() == 1 )
       path = fb.verifyFileList();

#if defined(WIN32)
    BOOST_TEST(path == "D:");
    fb.clearMemory();
#else
    BOOST_TEST(path == "/home");
#endif
}


/* load 2 level parent path
 *
 */
BOOST_AUTO_TEST_CASE(TL_5)
{
    BOOST_TEST_MESSAGE("TC5 : load 2 level parent path");

#if defined(WIN32)
    BOOST_TEST_MESSAGE("Test path: D:/workspaceqt");
    FileBot fb("D:/workspaceqt");
#else
    BOOST_TEST_MESSAGE("Test path: /home/kokhoe");
    FileBot fb("/home/kokhoe");
#endif

    string path = "";
    if( fb.initialize() == 1 )
       path = fb.verifyFileList();

#if defined(WIN32)
    BOOST_TEST(path == "D:/workspaceqt");
    fb.clearMemory();
#else
    BOOST_TEST(path == "/home/kokhoe");
#endif
}


/* load 3 level parent path
 *
 */
BOOST_AUTO_TEST_CASE(TL_6)
{
    BOOST_TEST_MESSAGE("TC6 : load 3 level parent path");

#if defined(WIN32)
    BOOST_TEST_MESSAGE("Test path: D:/workspaceqt/ui1");
    FileBot fb("D:/workspaceqt/ui1");
#else
    BOOST_TEST_MESSAGE("Test path: /home/kokhoe/workspaceqt");
    FileBot fb("/home/kokhoe/workspaceqt");
#endif

    string path = "";
    if( fb.initialize() == 1 )
       path = fb.verifyFileList();

#if defined(WIN32)
    BOOST_TEST(path == "D:/workspaceqt/ui1");
    fb.clearMemory();
#else
    BOOST_TEST(path == "/home/kokhoe/workspaceqt");
#endif
}

Wednesday, November 8, 2017

Accessing the last element of an Intrusive list

I always thought that retrieving the last element from an intrusive list just as easy as one two three. In fact, it is not. Let's dive into the story. Here I have the intrusive list declaration:
/***** FileNode.h *****/

class FileNode : public boost::intrusive::list_base_hook<link_mode<auto_unlink> >
{
...
...

};


/***** FileBot.h *****/

typedef boost::intrusive::list<FileNode, base_hook<list_base_hook<link_mode<auto_unlink> > >, constant_time_size<false> > FileNodeListType;

class FileBot
{
private:
    FileNodeListType fileList;

...
...
};
And then I'll use the iterator as shown below go straight to access the element located at the end of the list, but somehow this a hit segmentation fault error.
        FileNodeListType::iterator it(fileList.end());

        // following piece basically doing some data manipulation
        // on the element retrieve from the list.
        FileNode *p = &*it;
        (*it).sibling.push_back(*node);

        cout << p << endl; // memory address shows 0x7ffedff8e808
The error is due to the wrong memory address is being accessed. As I verify on the element being inserted into the list (only one element is inserted in this test case) is having an address different from the one mention in the code above. Thus, according to the expert, in order to access the correct element located at the end of the list is by doing this:
        FileNodeListType::iterator it(fileList.end());
        FileNode *p = &*(--it);
        cout << p << endl; // memory address shows 0xa01bc0

Monday, October 30, 2017

Remember to clear the memory hold in Boost recursive intrusive

While I'm studying on making an intrusive container right inside an object which which is also an intrusive container. Some sort of nested intrusive container I might think of. Well, I never thought Boost really made one for us, this feature is called recursive intrusive container.

There is some issue with this container. It happens only on Windows, no problem in Linux at all. When removing an object from an intrusive container, make sure the nested intrusive container is clear off before the object is removed. I have run some test case about it.

Here is the declaration of the object for the test case, the sibling member is the recursive intrusive container:
class FileNode : public boost::intrusive::list_base_hook<link_mode<auto_unlink> >
{
public:
    boost::intrusive::list<FileNode, base_hook<list_base_hook<link_mode<auto_unlink> > >, constant_time_size<false> > sibling;

...
...
};
This is the intrusive list declaration:
typedef boost::intrusive::list<FileNode, base_hook<list_base_hook<link_mode<auto_unlink> > >, constant_time_size<false> > FileNodeListType;

class FileBot
{
public:
    FileNodeListType fileList;
 
First, an object was pumped into the intrusive container, fileList in this case. Then allocate another object into recursive list, sibling. I loop through every object in the fileList and put in another object into the recursive list.
void FileBot::addFileNode()
{
    FileNode *f1 = new FileNode();
    f1->setName("ASD");
    fileList.push_back(*f1);
}

void FileBot::addSubFileNode()
{
    FileNode *f2 = NULL;

    for(FileNode &T : fileList)
    {
        f2 = new FileNode();
        f2->setName("QWE");
        T.sibling.push_back(*f2);
    }
}
Now I want to clear everything in the fileList, without knowing is there any object saved into the container yet. Basically I just doesn't care at all, I just want to clear the container. Thus I implemented the following code:
void FileBot::clearMemory()
{
   fileList.erase_and_dispose(fileList.begin(), fileList.end(), DisposeFileNode());
}
When I call this method, memory leak is detected. This happens only on Windows. This is my mistake. I overlook on the addSubFileNode() on how I extract the object, T from the fileList. Each object has its sibling, and each sibling has another FileNode object in it, which consume memory as well. For my case, the object inside sibling memory is not removed but the object in fileList memory has already removed. Put in other words, it just like the parent object contain the child object, when releasing the memory, the child's object must release first then only release the parent's object. And what I did here is I go straight to the release the parent object, thus the child object is now complaining about the memory leak.

So in order to play safe in this game, I'll do this:
void FileBot::clearMemory()
{
    for(FileNode &T : fileList) {
        T.sibling.erase_and_dispose(T.sibling.begin(), T.sibling.end(), DisposeFileNode());
    }

    fileList.erase_and_dispose(fileList.begin(), fileList.end(), DisposeFileNode());
}
Clean off the memory in the recursive intrusive container before cleaning the memory in the parent intrusive container. This should be the strategy for this game.

Sunday, October 29, 2017

Removing an item from the Boost Intrusive list

Over the pass few weeks, I was reading the a lot of articles about the intrusive list, its benefits of using it, and its use case example on game development. This brings me a strong message that there are so many peoples on Earth promoting this container. What was so special about this container? Due to the limitation on the container provided by STL and intrusive container's low memory footprint which could benefit to the application's performance.

Does Boost provide this container. Yes! During this great moment with a sudden impulse, I begin my journey to develop a demo program using the Boost Intrusive container. Tadaa~

So far the journey was up and down, a lot of road pump, and feel tired to continue. When trying to remove an item from the list using following method will not work. Before that, let's presume I have the following declaration in the header file:
typedef boost::intrusive::list<FileNode, base_hook<list_base_hook<link_mode<auto_unlink> > >, constant_time_size<false> > FileNodeListType;

class FileBot
{
private:
    FileNodeListType fileList;

    ...
    ...
};
Then I begin to insert something into the list, assume the object type is a FileNode, do take note that I do this purely in one method:
void FileBot::addFileNode()
{
    FileNode *f1 = new FileNode();
    f1->setName("ASD");
    fileList.push_back(*f1);

    cout << "addFileNode(): memory address of insert object: " << f1 << endl;
}
Then I have another method to remove an object from the list:
void FileBot::clearMemoryByName(string filename)
{
    cout << "before delete: " << fileList.size() << endl;

    FileNode *p = new FileNode();
    p->setName("ASD");

    fileList.erase(FileNodeListType::s_iterator_to(*p));
    delete p;

    cout << "clearMemoryByName(): memory address of remove object: " << p << endl;

    cout << "after delete: " << fileList.size() << endl;
}
But somehow this method will cause a memory leak due to the following assertion failed in boost/intrusive/list.hpp, line 1270:
Assertion failed: !node_algorithms::inited(value_traits::to_node_ptr(value)), file D:\tool\boost_1_61_0\msvc\boost/intrusive/list.hpp, line 1270
Detected memory leaks!
Dumping objects ->
...
...
...
Object dump complete.
Press <return> to close this window...
A million hours were spent on studying the weak spot of this creature has finally found a solution. The clue was lying on the memory address of the object. The memory address of the object inserted into the list in addFileNode() was different from the memory object created in the clearMemoryByName().
...

addFileNode(): memory address of insert object: 001180C8
clearMemoryByName(): memory address of remove object: 00111DE0

...
In the revise's version, the object was retrieved from the list and then perform some filtering (e.g. match the property field's value), then only perform the deletion.
void FileBot::clearMemoryByName(string filename)
{
    FileNode *p = NULL;
    FileNodeListType::iterator it(fileList.begin());

    for(; it != fileList.end(); it++)
    {
        cout << &*it << endl;

        // look for particular node
        string curName = (*it).getName().c_str();
        if( filename == curName) {
            p = &*it;
            break;
        }
    }

    cout << "claerMemoryByName(): memory address of remove object: " << p << endl;

    fileList.erase(FileNodeListType::s_iterator_to(*p));
    delete p;
The result was stunning! The object was removed from the list. And here is the output of the memory address:
...

addFileNode(): memory address of insert object: 00E2DAF0
claerMemoryByName(): memory address of remove object: 00E2DAF0

...
To conclude this, in order to remove an object from the list, O(n) algorithm is the only way to filter the object, then only come to deletion.

Saturday, October 14, 2017

Convert Windows file separator into Linux format

I wrote a function to extract the parent path for a given path. An interesting thing happens here; Windows's file system was used (\) backslash as file separator, whereas Linux's using (/) slash as file separator. Thus, this will end up I have following code produce:
void FileBot::constructParentPath()
{
   ...

   string sep = string(1, boost::filesystem::path::preferred_separator);
   if( sep == "/" )
      boost::replace_last(filePath, "/", "|");
   else
      boost::replace_last(filePath, "\", "|");

   ...
}
When handling file separator with Boost library, the library will able to convert the Windows's file separator into Linux format. Therefore, my code will reduce into one line from the previous version:
void FileBot::constructParentPath()
{
   ...

   string filePath = fileNode.getName().generic_string();
   boost::replace_last(filePath, "/", "|");

   ...
}

Sunday, September 24, 2017

A problem with the SOAP logical tree format

Just one small step will complete my finding. It was the web service development on Message Broker. Ever wonder how things done and I always got an excuse to delay my finding.

Say I have a WSDL, which consist of following:
...

<wsdl:types>
  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://helloWorld.huahsin.org/">
    <xsd:element name="Input">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="name" type="xsd:string" minOccurs="1"/>
          <xsd:element name="age" type="xsd:integer" minOccurs="0"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="Output">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="greetings" type="xsd:string" minOccurs="1"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:schema>
</wsdl:types>

...
Ignore those unnecessary stuff, let's focus only on the input and output. The above WSDL shows the following request input:
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:hel="http://helloWorld.huahsin.org/">
   <soap:Header/>
   <soap:Body>
      <hel:Input>
         <name>LKJ</name>
         <!--Optional:-->
         <age>0</age>
      </hel:Input>
   </soap:Body>
</soap:Envelope>
And the expected response message:
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:hel="http://helloWorld.huahsin.org/">
   <soap:Body>
      <hel:Output>
         <greetings>Hello World</greetings>
      </hel:Output>
   </soap:Body>
</soap:Envelope>
That's all about it. Somehow, things always out of expectation. I got this:
<soapenv:Envelope xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope">
   <soapenv:Body>
      <soapenv:Fault>
         <soapenv:Code>
            <soapenv:Value>soapenv:Receiver</soapenv:Value>
         </soapenv:Code>
         <soapenv:Reason>
            <soapenv:Text xml:lang="en">BIP3113E: Exception detected in message flow Main.SOAP Input (broker brk2)</soapenv:Text>
         </soapenv:Reason>
         <soapenv:Node>://localhost:7080/HelloWorld</soapenv:Node>
         <soapenv:Detail>
            <Text>BIP3752E: The SOAP Reply node 'Main.SOAP Reply' encountered an error while processing a reply message. 
An error occurred during reply message processing. 
See previous error messages to determine the cause of the error. : F:\build\slot1\S800_P\src\WebServices\WSLibrary\ImbSOAPReplyNode.cpp: 397: ImbSOAPReplyNode::evaluate: ComIbmSOAPReplyNode: Main#FCMComposite_1_2
BIP3605E: The SOAP logical tree cannot be serialized. 
There is a problem with the SOAP logical tree format. 
Review further error messages for an indication to the cause of the error. Check that the SOAP logical supplied is correctly formatted. : F:\build\slot1\S800_P\src\WebServices\WSLibrary\ImbSOAPParser.cpp: 1360: ImbSOAPParser::refreshBitStreamFromElementsInner: : 
BIP3602E: The Web service payload ''{http://helloWorld.huahsin.org/}Input'' does not match an operation described by WSDL binding ''HelloWorldBinding'' in file ''huahsin.wsdl''. 
The first child of the SOAP Body does not correspond to any of the operations defined in the specified WSDL definition. 
Check that the correct WSDL definition was deployed. : F:\build\slot1\S800_P\src\WebServices\WSLibrary\ImbSOAPParser.cpp: 790: ImbSOAPParser::refreshBitStreamFromElementsInner: :</Text>
         </soapenv:Detail>
      </soapenv:Fault>
   </soapenv:Body>
</soapenv:Envelope>
At first I was so struggling to work on this error, luckily I got the solution during my work, The more I work, the more knowledge I learn. This was due to the message construction happened in Transform Node:
CREATE COMPUTE MODULE Main_Compute
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- CALL CopyMessageHeaders();
    CALL CopyEntireMessage();

    DECLARE hel NAMESPACE 'http://helloWorld.huahsin.org/';

    SET OutputRoot.SOAP.Body.hel:Output.greetings = 'Hello World';

    RETURN TRUE;
  END;

  ...
END MODULE;
The error is from the OutputRoot. OutputRoot is the ultimate destination that the message will be reach at the end of the flow. During the message construction, any message that going to send out a response are advisable to clear them out. Like this:
    ...

    DELETE FIELD OutputRoot.SOAP.Body;
    SET OutputRoot.SOAP.Body.hel:Output.greetings = 'Hello World';

    ...
Leave the SOAP tree empty before any new message is constructed.