Your Privacy Matters: We use our own and third-party cookies to improve your experience on our website. By continuing to use the website we understand that you accept their use. Cookie Policy
240
Speed of IgxExcelCoreModule Workbook.Load
posted

We have an issue regarding the IgxExcelCoreModule's speed in our project.  For our purposes, we need to read in the names of each sheet within the workbook and then the headers of each sheet.  The following code is what we used to achieve that functionality:

if (event.target.files.length > 0) {
      this.importFile = event.target.files[0];

      //// Infragistics
      this.headersByWorksheetList = [];
      this.availableWorksheetList = [];

      Workbook.load(this.importFile, null,
        (workbook) => {
          console.log("load")
          // Access the worksheets
          for (let i = 0; i < workbook.worksheets().count; i++) {
            const worksheet = workbook.worksheets(i);
            this.availableWorksheetList.push(worksheet.name);
            const headers = [];
            const firstRow = worksheet.rows(0);
            for (let cellIndex = 0; cellIndex < firstRow.cells().count; cellIndex++) {
              headers.push(firstRow.cells(cellIndex).value)
            }
            this.headersByWorksheetList.push(new HeadersByWorkSheet(worksheet.name, headers));
            // console.log(`Worksheet ${i + 1}: ${worksheet.name}`);
          }

          console.log(this.headersByWorksheetList);

        }, (error) => {
          // Error callback: Handle any errors
          console.error('Error loading workbook:', error);
      });
      }

However, doing this with a reasonably small workbook (1700KB) takes ten plus seconds to load and with a larger one (21MB) it took around 4-5 mins, which is something we would potentially need to do.  This seems very unwieldy considering the simplicity of the functionality required. And this is just to get a response from the Workbook.load function, not even the loop within it - this is the time it takes to print console.log("load").

By contrast, using the XLSX library here sheetjs/README.md at master - sheetjs - SheetJS I tried the following code to perform the same action:

if (event.target.files.length > 0) {
      this.importFile = event.target.files[0];

      

      ////// XLSX
      const file: File = event.target.files[0];
      this.importFile = file;

      const reader = new FileReader();

      reader.onload = (e: any) => {
        /* Read the binary string */
        const bstr = e.target.result;
        const workbook = XLSX.read(bstr, { type: 'binary' });

        // this.worksheetList = workbook.SheetNames;
        this.availableWorksheetList = workbook.SheetNames;

        this.activeHeadersList = [];
        this.headersByWorksheetList = [];
        this.resetAllFields();

        workbook.SheetNames.forEach(sheetName => {
          const currentWorksheet = workbook.Sheets[sheetName];
          const data = XLSX.utils.sheet_to_json(currentWorksheet, { header: 1 });

          // Check column headers
          if (data.length > 0) {
            // this.worksheetList = [...this.worksheetList, ...workbook.SheetNames];

            this.headersByWorksheetList.push(new HeadersByWorkSheet(sheetName, data[0] as string[])); // First row contains headers
          }
        });
      }

      reader.readAsArrayBuffer(file);
    }

This one performed the same action in under a second, which is much more reasonable.  I wondered if there was something we could do to potentially replicate these speeds using the Infragistics Excel library?

Thanks!

Parents
  • 500
    Verified Answer
    Offline posted

    Hello,

    Current Behavior and Constraints

    The IgxExcelCoreModule from Infragistics offers robust functionality for interacting with Excel workbooks. However, it creates a comprehensive object model of the workbook upon loading. This model includes all sheets, cells, styles, formulas, and metadata, which is why it takes significant time when processing large workbooks.

    By contrast, SheetJS (XLSX) does not build a full object model. It focuses only on essential parts of the workbook, like headers and sheet names, resulting in faster performance. This difference in approach explains the significant discrepancy in load times.

    Capabilities of IgxExcelCoreModule

    1. Detailed Workbook Model: Provides a structured and detailed object model of Excel workbooks.
    2. Extensive API Support: Enables complex operations like formatting, formulas, and cell-level manipulation.
    3. Compatibility: Ideal for scenarios where a detailed representation of the workbook is required.

    Limitations in Your Use Case

    • Full Workbook Processing: Even when you need only sheet names and headers, the library processes the entire workbook, which causes delays.
    • No Selective Loading: Currently, there is no functionality to load specific parts of the workbook, like just the sheet names or first-row headers.

    Why Is This Happening?

    The underlying design of the IgxExcelCoreModule prioritizes a rich feature set for advanced workbook manipulations. This results in more resource-intensive operations. Libraries like SheetJS are optimized for simpler tasks, such as quick access to sheet names and headers, without building a detailed model.

    Potential Solutions

    1. Optimize Workbook Design

    If you have control over how the workbooks are created:

    • Reduce the number of sheets.
    • Limit the use of complex formulas.
    • Simplify formatting to make the workbook lighter and faster to process.
    1. Improve System Resources
    • Ensure that the application runs in an environment with adequate memory and CPU power, as these significantly impact processing times.
    1. Use IgxExcelCoreModule Efficiently
    • Asynchronous Loading: Ensure the Workbook.load operation is handled asynchronously without blocking the main thread, minimizing user-perceived delays.
    • Batch Processing: If possible, split large workbooks into smaller chunks or process them in segments.
    1. Customize Your Code for Specific Needs

    For quick access to sheet names and headers, you can optimize the existing code by leveraging internal features of IgxExcelCoreModule or pre-filtering the data where possible.

    Suggested Approaches to Address the Issue

    To address the performance challenges while continuing to use the IgxExcelCoreModule, here are practical approaches tailored to your use case. These are intended to optimize performance and improve user experience.

    1. Optimize for Large Files
    • Simplify the Workbook Structure: Reduce the number of sheets, avoid complex formulas, and limit heavy formatting where possible.
    • Pre-Processing: If feasible, pre-process the Excel files to remove unnecessary data or split large workbooks into smaller, more manageable ones.
    1. Partial Data Extraction with Staged Loading
    • Sheet Names First: Extract and display sheet names immediately after loading begins to give users actionable information quickly.
    • Headers Later: Process headers in the background or on-demand when the user selects a specific sheet.
    1. Optimize Workbook Handling
    • Process Sheets Individually: Instead of processing the entire workbook at once, load and process only the sheet required at any given time.
    • Selective Data Access: Focus only on the rows or columns that are necessary for your operation, minimizing resource usage.
    1. Provide Feedback to Users
    • Loading Indicators: Display a progress spinner or status message while the workbook is loading to enhance user experience.
    • Estimated Time Display: Inform users about the expected duration for the operation, especially for large workbooks.
    1. Hybrid Approach
    • Combine lightweight methods for quick access to metadata (e.g., sheet names) with IgxExcelCoreModule for detailed operations.
    • For example:
      • Use a lightweight parser to extract sheet names and basic information quickly.
      • Use IgxExcelCoreModule to load and manipulate workbook details when necessary.

    Request for Isolated Sample

    To better analyze the specific behavior you are encountering, I kindly request an isolated sample project that replicates the issue. The sample should include:

    1. A minimal implementation of the IgxExcelCoreModule with the described functionality.
    2. A representative Excel workbook file (e.g., one that is 1.7MB and another that is 21MB).
    3. Steps to reproduce the behavior.

    Conclusion

    Given the current capabilities of the IgxExcelCoreModule, the above approaches will help optimize the performance while preserving its detailed functionality.

    I encourage you to create an isolated sample project that replicates the issue using the guidelines provided earlier. This will allow us to analyze the problem thoroughly and provide more targeted optimizations. Thank you for your understanding, and I look forward to working together on this!

    Regards,

    Georgi Anastasov

    Associate Software Developer

    Infragistics

  • 240
    Offline posted in reply to Georgi Anastasov

    Hi,

     

    Thanks very much for getting back to me with such a detailed response.

     

    I think for the sake of our requirements, the simplest outcome is to stick with the latter solution I originally posted using the XLSX library since it provides us the limited functionality we require with the advantage of being considerably faster.  The bulk of our excel manipulation (imports and exports) is handled on the backend of our system, so it wouldn’t make sense to try and utilise the Infragistics’ IgxExcelCoreModule to do something simpler than it seems designed to handle.

     

    We’ll be unable to pre-filter the data on the workbooks themselves or break them into batches, as this is the whole point of what we are trying to achieve through this piece of functionality. It should allow the user to upload a potentially large workbook of data and quickly grab its sheet names and headers before sending the necessary DTO to the backend to be handled as the user sees fit while still keeping within the confines of our database requirements.

     

    Thanks again for getting back to me, I will refer back to this if looking to use a more complex and robust frontend solution.

Reply Children
No Data