[jira] Created: (DDLUTILS-258) Degradation of performance of Table when using DdlUtils with a large dataset

JIRA jira@apache.org
Degradation of performance of Table when using DdlUtils with a large dataset
----------------------------------------------------------------------------

                 Key: DDLUTILS-258
                 URL: https://issues.apache.org/jira/browse/DDLUTILS-258
             Project: DdlUtils
          Issue Type: Improvement
          Components: Core (No specific database)
    Affects Versions: 1.1
            Reporter: Tom Palmer
            Assignee: Thomas Dudziak
            Priority: Minor


When using DdlUtils to load large amounts of data, I've observed a drop-off in performance originating in the Table class. It uses an ArrayList to store its columns, foreign keys and indices, but treats the contents as a Set within the equals, hashCode and sortForeignKeys methods. On one particular project I've observed up to 2 minutes of the overall load time being consumed in these methods.
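
The pattern described above can be illustrated with a minimal sketch. This is not the DdlUtils source: the class name ListBackedTable, the use of plain strings for foreign keys, and the method bodies are all assumptions made for illustration. It only shows why a list that is compared as a set gets expensive: every equals or hashCode call copies the list into a fresh HashSet.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

// Hypothetical stand-in for the pattern in the report; not the actual Table class.
class ListBackedTable {
    // Elements are kept in a list so that insertion order is preserved.
    private final List<String> foreignKeys = new ArrayList<>();

    void addForeignKey(String name) {
        foreignKeys.add(name);
    }

    // Order-insensitive comparison: both lists are copied into
    // temporary HashSets on every single call.
    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof ListBackedTable)) {
            return false;
        }
        ListBackedTable other = (ListBackedTable) obj;
        return new HashSet<>(foreignKeys).equals(new HashSet<>(other.foreignKeys));
    }

    // Another temporary HashSet per call, so the hash is order-insensitive too.
    @Override
    public int hashCode() {
        return new HashSet<>(foreignKeys).hashCode();
    }
}
```

When such objects are used as keys in hash-based collections or compared repeatedly during a large load, these per-call copies add up.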

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (DDLUTILS-258) Degradation of performance of Table when using DdlUtils with a large dataset

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/DDLUTILS-258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom Palmer updated DDLUTILS-258:
--------------------------------

    Attachment: ListOrderedSet.patch

The attached patch changes the _columns, _foreignKeys and _indicies fields to use the Commons Collections ListOrderedSet. This preserves the order while also ensuring uniqueness of the elements. With this change I've been able to remove the four HashSet creations in equals and hashCode. sortForeignKeys must now create a temporary ArrayList to perform the sorting, but it is called far less often, so I don't believe this to be a problem.
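
The idea behind the patch can be sketched roughly as follows. This is not the patch itself: to keep the example free of external dependencies it swaps in java.util.LinkedHashSet for the Commons Collections ListOrderedSet, and the class name SetBackedTable and the string-valued foreign keys are hypothetical. The point is that an ordered set can be compared and hashed directly, while sorting needs a temporary list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; LinkedHashSet stands in for ListOrderedSet.
class SetBackedTable {
    // Keeps insertion order and rejects duplicates.
    private final Set<String> foreignKeys = new LinkedHashSet<>();

    // Returns false if the key was already present.
    boolean addForeignKey(String name) {
        return foreignKeys.add(name);
    }

    // Snapshot of the keys in their current order.
    List<String> getForeignKeys() {
        return new ArrayList<>(foreignKeys);
    }

    // Sorting needs a temporary list, then the set is refilled in sorted order.
    void sortForeignKeys() {
        List<String> sorted = new ArrayList<>(foreignKeys);
        Collections.sort(sorted);
        foreignKeys.clear();
        foreignKeys.addAll(sorted);
    }

    // Set equality is already order-insensitive, so no temporary copies are needed.
    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof SetBackedTable)) {
            return false;
        }
        return foreignKeys.equals(((SetBackedTable) obj).foreignKeys);
    }

    @Override
    public int hashCode() {
        return foreignKeys.hashCode();
    }
}
```

The trade-off matches the one described above: the frequently called equals and hashCode no longer allocate, and the extra allocation moves into the rarely called sort.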

